Nginx Upload limits on Beanstalk Docker

If I am not wrong, nginx only allows you to upload up to 1 MB of data by default (the client_max_body_size default is 1m). If you are doing a docker deployment on beanstalk you need to remember to change that not once but twice!

As you may know already, beanstalk creates an EC2 instance to manage the docker environment.
Since the EC2 host has to manage the docker environment and serve the web interface as well, it runs a second nginx instance in front of the nginx inside docker. Hence, if you want to allow bigger uploads, you have to modify the nginx settings in both places - inside docker as well as on the EC2 host.

    # max upload size
    client_max_body_size 10M;   # adjust to your liking

Also, if you don’t want any upload limit at all, just set client_max_body_size to 0.

    # max upload size
    client_max_body_size 0;
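The nginx inside docker can be changed by editing its own config, but the EC2-side nginx is managed by beanstalk. One common way to reach it is an .ebextensions config file in your app bundle that drops the directive into the host nginx - a sketch only, assuming the host nginx includes /etc/nginx/conf.d/*.conf; the file names here are my own, not anything beanstalk mandates:

```yaml
# .ebextensions/01-upload-size.config (hypothetical file name)
files:
  "/etc/nginx/conf.d/upload_size.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # allow uploads up to 10 MB on the host nginx as well
      client_max_body_size 10M;
```

The host nginx picks the file up on its next restart/reload after deployment.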

Updating Django Source with Docker Deployments

When deploying docker multiple times, you may not want to copy over your Django source code every time you do a deployment.

Setting up supervisord

Luckily there is an easy way to manage this. Since you are working with Django, there is a good chance that you are also managing the processes (like uwsgi) with supervisord.

Here are some of the steps that you can take with supervisord

  • Set up a new process in supervisord
  • Do not allow it to autorestart since it will be a one-shot process
  • Have it call another script, in any language, to update the source code
    • As an example, I use a bash script to update my source code through git

Here’s a sample code:

    [program:source-updater]
    redirect_stderr = true
    stdout_logfile = /shared/source_code_updater.log
    directory = /ws/
    command = /ws/source_code_updater.sh
    autorestart = false
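The update script itself might look something like this - a sketch only, written as a function for easy reuse; the paths and the wsgi.ini file name are assumptions based on the configs in this post:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of source_code_updater.sh: sync a working copy
# to a release tag, then touch the uWSGI ini so --touch-reload fires.
set -euo pipefail

update_source() {
    local repo_dir="$1" tag="$2"
    cd "$repo_dir"
    git fetch --tags origin          # bring down any newly pushed tags
    git checkout -f "tags/$tag"      # hard-switch the working tree to the tag
    touch wsgi.ini                   # trigger uWSGI's --touch-reload
}
```

The real script called from supervisord would simply invoke the function with /ws and the tag to deploy.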

Updating the source code

A few things are important to note in a docker deployment:

  • Not every commit needs to be deployed
  • Filter your commits to only allow deployable code to be updated on docker
  • Include regression, unit and system tests to be part of your build process
  • Once everything has been confirmed to be working, tag your code so that you know it is worthy of going to docker
  • Another way would be to manage this process through branches and merge only if everything passes
  • docker deployments would build off this merged branch or tagged version
  • This way, even if you have made 10 commits while fixing a bug and are still in the process of fixing it, you know they won’t end up in a docker deployment

With that idea, fetch the tags and check out the source code at the specific tag:

    git fetch origin --tags
    git checkout -f tags/your_tag_name

Telling uwsgi about the updated source code

Once you have updated your source code, you need to reload the project in uWSGI so that nginx or apache can pick it up. The simplest way to achieve this is uwsgi’s --touch-reload config parameter: it reloads uWSGI whenever the specified file is modified/touched.

Just remember to set up supervisord in your Dockerfile with this config parameter.

    [program:app-uwsgi]
    redirect_stderr = true
    stdout_logfile = /var/shared/_uwsgi.log
    command = /ws/ve_envs/rwv2/bin/uwsgi --touch-reload=/ws/wsgi.ini --ini /ws/wsgi.ini

You can choose any file. I chose wsgi.ini because its contents never really need to change.

Multiple Virtual Environments in Docker

It may seem like a daunting task to run multiple python projects, each in its own virtual environment, in docker when you want to manage all of them from a single place - let’s say supervisord. However, all that is required is to know that python automatically picks up the virtual environment if you provide the full path to that virtual environment’s python binary.

For example, in my docker environment, I have a virtual environment installed at the following location:

/ws/ve_envs/rwv1/

To enable a project with this virtual environment, I can run the following:

/ws/ve_envs/rwv1/bin/python3.4 PYTHON_PROJECT_FILE_TO_RUN.py

Similarly, other projects can be set up in the same way.

For example, for running uwsgi I provide the full path for python as follows:

[program:appName]
stdout_logfile = /var/shared/_uwsgi.log
command = /ws/ve_envs/project/bin/uwsgi --touch-reload=/ws/wsgi.ini --ini /ws/wsgi.ini

You might want to read about --touch-reload in my other post.

Sharing folders on Beanstalk Docker

It is very easy to set up volume sharing in docker. Ideally you want the following folders to be shared when a new docker container is initialized for you:

  • /var/log so that you can keep track of logs
  • nginx specific folders because you will have two instances of nginx running - one on docker and another on EC2. This allows you to share logs
  • your personal workspace or anything that you’d like to share

Here’s how you’d do it. The keyword is VOLUME in your Dockerfile:

VOLUME [ \
    "/var/shared/", \
    "/etc/nginx/sites-enabled", \
    "/var/log/nginx", \
    "/ws/" \
]

Convert GitHub Wiki to Static Site with themes

I recently wanted to set up a wiki so that I could convert it into a static HTML site with a proper theme. What could be a possible use case for such a requirement?

  • Manage the documentation of a product internally through git but publish it for clients/world through static site
  • Convert the uncolored wiki to a themed version
  • Allow serving of the wiki through web application frameworks like Django
    • It may allow you to put an authentication system in place as a first hurdle, to stop everybody from getting access

Anyway, I went about the whole process and decided to jot down everything. Here I am taking the D3 Wiki as an example, which I will convert into a static site. Let’s begin.

D3 Wiki using pelican

Setup and requirements

What do we need to get started?

  • We will need a static site generator
    • Let’s use pelican for this demo
  • An actual wiki
  • Python environment so that pelican and fabric can be installed

Virtual Environment with pelican

Setup the virtual environment

$ virtualenv ve_blog
$ source ve_blog/bin/activate

Install pelican

$ pip install pelican

Pelican Quickstart

Set up pelican using pelican-quickstart so that all the files are set up correctly for creating a static site.

$ pelican-quickstart

Welcome to pelican-quickstart v3.6.3.

This script will help you create a new Pelican-based website.

Please answer the following questions so this script can generate the files
needed by Pelican.

    
> Where do you want to create your new web site? [.] 
> What will be the title of this web site? D3 WIKI
> Who will be the author of this web site? abhi1010
> What will be the default language of this web site? [en] 
> Do you want to specify a URL prefix? e.g., http://example.com   (Y/n) n
> Do you want to enable article pagination? (Y/n) Y
> How many articles per page do you want? [10] 
> What is your time zone? [Europe/Paris] Asia/Singapore
> Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n) Y
> Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n) Y
> Do you want to upload your website using FTP? (y/N) N
> Do you want to upload your website using SSH? (y/N) N
> Do you want to upload your website using Dropbox? (y/N) N
> Do you want to upload your website using S3? (y/N) N
> Do you want to upload your website using Rackspace Cloud Files? (y/N) N
> Do you want to upload your website using GitHub Pages? (y/N) N
Done. Your new project is available at /Users/apandey/code/githubs/d3wiki

Get the wiki

$ git clone https://github.com/mbostock/d3.wiki.git

Cloning into 'd3.wiki'...
remote: Counting objects: 12026, done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 12026 (delta 607), reused 552 (delta 552), pack-reused 11407
Receiving objects: 100% (12026/12026), 9.92 MiB | 1.49 MiB/s, done.
Resolving deltas: 100% (7595/7595), done.
Checking connectivity... done.

Setting the wiki as content for pelican

$ rmdir content
$ ln -s d3.wiki content

Why simple pelican won’t work and what to do

If you try to simply build the static site at this point, you will notice a lot of errors like:

$ fab build

ERROR: Skipping ./请求.md: could not find information about 'NameError: title'
ERROR: Skipping ./过渡.md: could not find information about 'NameError: title'
ERROR: Skipping ./选择器.md: could not find information about 'NameError: title'
ERROR: Skipping ./选择集.md: could not find information about 'NameError: title'
Done: Processed 0 articles, 0 drafts, 0 pages and 0 hidden pages in 3.47 seconds.

The problem is that pelican expects some variables to be defined in each markdown file before it can build the static files. Some of these variables are:

  • Title
  • Slug
  • Date

You may add more of your own as well. However, for our initial purposes, we will keep it simple and just add these three.

Next, how do we achieve this automation? fab is our answer.

Let’s write a python function that modifies the markdown files, updating them to add Title, Slug and Date.

We will edit fabfile.py and add a new function create_wiki:

def create_wiki():
    files = []
    # Find all markdown files in the content folder
    for f in os.walk('./content/'):
        fpath = lambda x: os.path.join(f[0], x)
        for file in f[2]:
            files.append(fpath(file))
    filtered = [f for f in files if f.endswith('.md')]
    for file in filtered:
        with open(file, 'r+') as f:
            content = f.read()
            f.seek(0, 0)
            base = os.path.basename(file).replace('.md', '')
            lines = ['Title: {}'.format(base.replace('-', ' ')),
                     'Slug: {}'.format(base),
                     'Date: 2015-08-07T14:59:18-04:00',
                     '', '']
            # Prepend the metadata lines to the file
            f.write('\n'.join(lines) + '\n' + content)
        print(file)

    # build and serve the website
    build()
    serve()

Now you can call this function easily:

fab create_wiki

The website can now be viewed at http://localhost:8000

What happened to the menu?

There is a minor issue here though: you will notice that the menu is not available - it is all empty. It is an easy addition. We need to add some lines to publishconf.py to define what the menu should contain.

For my example, I have chosen to show up the following for D3:

  • API Reference
  • Tutorials
  • Plugins

# We don't want all pages to show up in the menu
DISPLAY_PAGES_ON_MENU = False

# Choose the specific pages that should be part of menu
MENUITEMS = ( 
    ('HOME', '/home.html'),
    ('API Reference', '/API-Reference.html'),
    ('Tutorials', '/Tutorials.html'),
    ('Plugins', '/Plugins.html'),
)

Choosing themes

By default, pelican uses its own theme for the static site, but the theme can be changed. Let’s choose pelican-bootstrap3 for our example here:

git clone https://github.com/DandyDev/pelican-bootstrap3.git

Now, add the full path to the theme at the end of the publishconf.py file:

THEME = "/Users/apandey/code/githubs/pelican_coders/all_themes/pelican-bootstrap3"

Finally, build your site again and serve:

fab build
fab serve

Pelican Bootstrap3 theme

Get all this code in github repo

I realize there may be a few things going on here. You can get this whole setup as a project from my github repo.

You will find all the code and setup there so that your life is easier. Just start with the d3 wiki along with the virtual environment and you will be fine.

Docker Container cleanup on Elastic Beanstalk

Sometimes you may notice that old containers are not cleaned up from the beanstalk environment. This may be because your containers are still running as ghosts in the background. One way to find out is to quickly check whether your /var/lib/docker/vfs/dir directory has too many folders.

Next, find out what container processes you have going on:

    [root@ip dir]# docker ps -a

You might see something like this:

    CONTAINER ID        IMAGE                              COMMAND             CREATED             STATUS              PORTS               NAMES
    1611e5ebe2c0        aws_beanstalk/staging-app:latest   "supervisord -n"    About an hour ago                                           boring_galileo
    e59d0dd8bba1        aws_beanstalk/staging-app:latest   "supervisord -n"    About an hour ago                                           desperate_yalow
    3844d0e18c47        aws_beanstalk/staging-app:latest   "supervisord -n"    2 hours ago         Up 8 minutes        80/tcp              pensive_jang

Ideally, we want to “forcibly remove” all images (and hence the folders from /var/lib/docker/vfs/dir directory) that are not in use anymore. Just run the following to test whether it works:

    docker rmi -f `docker images -aq`

You might run into trouble where docker says that all those images already have containers running them. This means those containers are orphaned but were not killed as we thought. Let’s remove each of them along with their shared volumes, if any.

    docker rm -fv `docker ps -aq` 

This will:

  • kill the container
  • unlink the volumes

You should see a lot more space now on your beanstalk instance.

    [root@ip dir]# df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/xvda1      7.8G  1.8G  5.9G  24% /
    devtmpfs        490M   96K  490M   1% /dev
    tmpfs           499M     0  499M   0% /dev/shm

Last Resort

If you feel that all this is not working, then you can try one of the scripts provided by docker itself at GitHub

It will delete the folders under /var/lib/docker and try to do it responsibly.

Partition Linked List around a Value X

How do you partition a list around a value x, such that all nodes less than x come before all nodes greater than or equal to x?

Well, there are a few possible solutions. The one I came up with is a bit convoluted, but let me explain the idea behind it. You want to track the following:

  • Two pointers to remember the beginning of the lower and higher series each

  • One pointer (current) to iterate through the Linked List

  • The list may itself start with a higher or lower value compared to the middleValue. Thus we need to remember the beginning of the lower series (lowerSeries), as this is what we will return

Now that we have this out of the way, let’s look at the code:

Code

As usual the code is available here:

https://github.com/abhi1010/Algorithms
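The idea above can be sketched roughly as follows - a minimal sketch only, assuming a simple Node class; the names here are mine, not necessarily what the repo uses:

```python
class Node:
    """Minimal singly linked list node (assumed, not the repo's actual class)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def partition(head, x):
    """Rearrange the list so all nodes < x come before all nodes >= x."""
    lower_head = lower_tail = None    # beginning/end of the lower series
    higher_head = higher_tail = None  # beginning/end of the higher series
    current = head                    # pointer iterating through the list
    while current:
        nxt = current.next
        current.next = None           # detach before appending to a series
        if current.value < x:
            if lower_head is None:
                lower_head = lower_tail = current
            else:
                lower_tail.next = current
                lower_tail = current
        else:
            if higher_head is None:
                higher_head = higher_tail = current
            else:
                higher_tail.next = current
                higher_tail = current
        current = nxt
    if lower_tail is None:            # no node was smaller than x
        return higher_head
    lower_tail.next = higher_head     # stitch the two series together
    return lower_head
```

Note that we return lower_head, since the original head may well belong to the higher series.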

Find the Kth to Last Element of a Singly Linked List

It is possible to write a recursive solution, but I will use a simple runner logic. Recursive solutions are usually less optimal.

Note here that, in our logic K=1 would return the last element in the linked list. Similarly, K=2 would return the second last element.

The suggested solution here is to use two pointers:

  • One pointer will first travel K items into the list
  • Once that is done, both the pointers start travelling together, one item at a time
  • They keep travelling until the end of linked list is found
  • In that situation, the first pointer is at the end of the list, but the second pointer has only reached the Kth element from the end - this is exactly what you want

Let’s have a look at the code:

As usual the code is available here:

https://github.com/abhi1010/Algorithms
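The two-pointer steps above can be sketched like this - a minimal sketch, assuming a simple Node class; names are mine, not necessarily the repo's:

```python
class Node:
    """Minimal singly linked list node (assumed, not the repo's actual class)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def kth_to_last(head, k):
    """Return the Kth-to-last node (K=1 is the last element),
    or None if the list has fewer than k nodes."""
    runner = head
    for _ in range(k):          # first pointer travels K items into the list
        if runner is None:
            return None         # list is shorter than k
        runner = runner.next
    trail = head
    while runner:               # both pointers now travel together
        runner = runner.next
        trail = trail.next
    return trail                # trail is K elements from the end
```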

Removing Duplicates from Linked List

Duplicates can be removed in many ways:

  • Create a new Linked List containing only unique items

  • Iterate through the Linked List and keep removing items that are being repeated

The internal structure for the algo can be either map or set based. When using a map, the Node itself can be saved, making your life easier if you are creating a new Linked List. However, sets are very useful if we are just iterating through the Linked List and deleting items that are repetitive. This is also a great space-saver. Hence we decided to go down this path.

Code

As usual the code is available here:

https://github.com/abhi1010/Algorithms

Here’s a small sample of how to do it:
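A minimal set-based, in-place sketch of the approach, assuming a simple Node class; names are mine, not necessarily the repo's:

```python
class Node:
    """Minimal singly linked list node (assumed, not the repo's actual class)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def remove_duplicates(head):
    """Delete repeated values in place, keeping the first occurrence of each.

    A set tracks values already seen, so no second list is built."""
    seen = set()
    prev, current = None, head
    while current:
        if current.value in seen:
            prev.next = current.next   # unlink the duplicate node
        else:
            seen.add(current.value)
            prev = current
        current = current.next
    return head
```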

Deleting a Node from Singly Linked List

Deleting a Node from a Singly Linked List is rather straightforward.

  • You have to know the head first of all

  • Start by checking the head if that’s the one you are looking for

  • Keep moving forward and checking - always check for null pointers everywhere

Before we talk about the code, let’s see how the Linked List is set up.

Now, below is the code for it.

Code

As usual the code is available here:

https://github.com/abhi1010/Algorithms
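The steps above can be sketched like this - a minimal sketch, assuming a simple Node class; names are mine, not necessarily the repo's:

```python
class Node:
    """Minimal singly linked list node (assumed, not the repo's actual class)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def delete_node(head, value):
    """Remove the first node holding value; return the (possibly new) head."""
    if head is None:                 # empty list: nothing to do
        return None
    if head.value == value:          # check the head first of all
        return head.next
    current = head
    while current.next is not None:  # always guard against null pointers
        if current.next.value == value:
            current.next = current.next.next
            return head
        current = current.next
    return head                      # value not found; list unchanged
```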