We are going to talk about a setup where all you need to do is commit your code; everything else, from unit tests to deployment, is taken care of by an externally hosted cloud platform that provides continuous integration.
In my case I will be using Shippable as the example, but you can use almost any similar service, such as Travis CI or Codeship.
Setup
Here is the setup we will be looking at:
Shippable for commits
We will use Shippable for the following:
Unit Tests
Regression Tests
Localized DB Tests
Tagging of the source code if the commit passes all tests
Deployment of the source code on beanstalk running docker
Log in to Shippable and set up your project to be built. Shippable registers webhooks with your repository host, such as GitHub or Bitbucket, which are called on every commit.
You can create a shippable.yml file in your project, which will be run on every commit. If you have used docker before, the flow might look familiar, because Shippable spins up a docker container to run the steps within shippable.yml.
Here is one of my sample files from the project that powers this blog:
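A minimal sketch of what such a file might look like for a Python project - the test command, tag name, and deploy script below are assumptions, not the actual file:

# shippable.yml - illustrative sketch only
language: python

python:
  - 2.7

script:
  # unit tests, regression tests and localized DB tests go here
  - python manage.py test

after_success:
  # tag the approved commit and kick off the beanstalk deployment
  - git tag -f approved && git push -f origin approved
  - bash deploy.sh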
Only if the tests in the script section pass does Shippable move on to the after_success section.
There, you might want to tag your source code, so that only tagged and approved commits - not every commit - get pulled into the docker deployment, which is very important.
Once your code commit has been approved, it is time to deploy it to docker on beanstalk.
I like to keep the deployment steps in a separate bash script, so that deployment can also be done in various other ways if needed; a sketch follows.
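As an illustration only - the repository URL and tag are assumptions - such a deploy.sh might clone the deployment project and hand it to the EB CLI:

#!/usr/bin/env bash
# deploy.sh - illustrative sketch; assumes the EB CLI has been configured (see eb config later)
set -e
git clone --branch approved --depth 1 https://github.com/user/deployment-project.git
cd deployment-project
eb deploy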
The Dockerfile does the following (a sketch follows this list):
Run from a custom base image - where all apps and project requirements have already been installed and configured. This saves me a lot of time during deployments.
Download the source code using RUN - which I update later using another method.
Expose port 80 so that this docker container can be used as a web container.
Set CMD so that supervisord is used for running the container.
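A minimal sketch under those assumptions - the image name, repository URL, and tag are made up:

# Custom base image with all apps and project requirements pre-installed (name is an assumption)
FROM user/base-image:latest

# Download the tagged, approved source code; a separate script updates it later
RUN git clone --branch approved --depth 1 https://github.com/user/project.git /ws/project

# Serve web traffic from this container
EXPOSE 80

# Run supervisord in the foreground as the container's main process
CMD ["supervisord", "-n"]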
Beanstalk Configuration
Once we have the Dockerfile ready, we need to set up the configuration for beanstalk so that the remaining steps are taken care of during deployment. Some things to keep in mind in the beanstalk setup are:
Tips
All beanstalk configuration has to be kept in a folder called .ebextensions
The beanstalk EC2 instance internally maintains folders of scripts that it runs while setting up docker, so that the instance is made ready for you
It is entirely possible to plug your own scripts into the beanstalk initialization setup, so that you can program a custom EC2 instance for yourself (see the sketch after this list)
The folders to place your scripts in are /opt/elasticbeanstalk/hooks/appdeploy/pre and /opt/elasticbeanstalk/hooks/appdeploy/post
Scripts placed in these folders are run in alphabetical order
You can increase the timeout of your docker initialization setup if it takes too long due to multiple steps
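For illustration - the file name and script body below are assumptions - an .ebextensions config can drop such a hook into place:

# .ebextensions/0005_hooks.config (illustrative only)
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/99_custom.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Runs after each deployment; the numeric prefix controls alphabetical ordering
      echo "deploy finished at $(date)" >> /var/log/custom-deploy.log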
Finally, make sure that your folder files are setup as follows:
$ (master) tree -a
.
|-- .ebextensions
|   |-- 0000_users.config     # setup extra users
|   |-- 0002_packages.config  # install external packages
|   |-- 0004_bash.config      # I like to manage all
|-- .elasticbeanstalk
|   |-- config.yml            # AWS connection details
|-- .gitignore
|-- Dockerfile                # Docker instance
|-- Dockerrun.aws.json        # folder sharing
|-- updater.sh                # script to update any code
.elasticbeanstalk folder for AWS configs
You might be wondering what the .elasticbeanstalk folder is. It is the folder responsible for holding your AWS secret key and access ID for doing the actual deployment. If you don't set it up, AWS will ask you for credentials every time.
To set it up, you just need to call eb config once; it creates the folder for you with all the details, including connection details. You can then make it part of your git commits.
Make sure it is secure.
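For reference, the generated config.yml has roughly this shape - all values below are placeholders:

# .elasticbeanstalk/config.yml (illustrative shape only)
branch-defaults:
  master:
    environment: myapp-env
global:
  application_name: myapp
  default_region: us-east-1
  profile: eb-cli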
And that's it! Once you commit your code, Shippable will run the tests, tag your code and finally download this deployment project and deploy it to beanstalk through docker containers.
If I am not wrong, nginx only allows uploads of up to 1 MB by default. If you are doing a docker deployment on beanstalk, you need to remember to change that not once but twice!
As you may know already, beanstalk creates an EC2 instance to manage the docker environment.
Since EC2 needs to manage the docker environment and serve the web interface as well, it runs another nginx instance on the host, in front of the nginx within docker.
Hence, if you had to modify the nginx settings to allow bigger uploads, you’d have to modify the settings for nginx on both - docker as well as EC2.
# max upload size
client_max_body_size 10M;  # adjust to your liking
Also, if you don’t want to have any limit at all for uploads, then just change the client_max_body_size to 0.
When deploying docker repeatedly, you may not want to copy over your Django source code on every single deployment.
Setting up supervisord
Luckily there is an easy way to manage this. Since you are working with Django, there is a good chance that you are also managing the processes (like uwsgi) with supervisord.
Here are some of the steps that you can take with supervisord (sketches follow this list):
Set up a new process in supervisord
Do not allow it to autorestart since it will be a one-shot process
Call another script in any format to update the source code
As an example, I use bash to update my source code through git
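A minimal sketch of such a one-shot entry, assuming the updater script lives at /ws/updater.sh:

; supervisord.conf - illustrative one-shot updater entry
[program:code_updater]
command=/bin/bash /ws/updater.sh
autostart=true
autorestart=false   ; one-shot: do not restart once it exits
startsecs=0         ; treat the process as started even if it exits quickly

The updater script itself might simply pull the approved tag - the repository layout here is an assumption:

#!/usr/bin/env bash
# /ws/updater.sh - illustrative: fetch the latest approved code
cd /ws/project || exit 1
git fetch --tags origin
git checkout -f approved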
Once you have updated your source code, you need to reload the project in uwsgi so that nginx or apache can pick it up.
The simplest way to achieve this is uwsgi's --touch-reload config parameter: it reloads uWSGI whenever the specified file is modified/touched.
Just remember to set up supervisord in your Dockerfile with this config parameter.
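For illustration - all paths here are assumptions - the supervisord entry for uwsgi could look like this, with the updater script touching the reload file after pulling new code:

[program:uwsgi]
command=/ws/ve_envs/rwv1/bin/uwsgi --ini /ws/project/uwsgi.ini --touch-reload=/ws/project/reload.txt
autorestart=true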
It may seem like a daunting task to have multiple python projects running in their own virtual environments in docker when you want to manage all the processes from a single source - let's say supervisord.
However, all that is required is to know that python automatically picks up the correct virtual environment if you provide the full path to that virtual environment's python binary.
For example, in my docker environment, I have a virtual environment installed at the following location:
/ws/ve_envs/rwv1/
To enable a project with this virtual environment, I can run the following:
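The idea is simply to invoke the virtualenv's own interpreter directly; for example, a supervisord entry along these lines (the project path is an assumption):

[program:django_app]
command=/ws/ve_envs/rwv1/bin/python /ws/project/manage.py runserver 0.0.0.0:80
autorestart=true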
I recently wanted to set up a wiki so that I could convert it into a static HTML site with a proper theme.
What could be a possible use case for such a requirement? Here are a few:
Manage the documentation of a product internally through git but publish it for clients/world through static site
Convert the uncolored wiki to a themed version
Allow serving of the wiki through web application frameworks like Django
It may allow you to put an authentication system in place as a first hurdle, so that not everybody gets access
Anyway, I went through the whole process and decided to jot down everything. Here I am taking the D3 Wiki as an example,
which I will be converting into a static site. Let's begin.
Set up Pelican using pelican-quickstart so that all the files needed for creating a static site are laid out correctly.
$ pelican-quickstart
Welcome to pelican-quickstart v3.6.3.
This script will help you create a new Pelican-based website.
Please answer the following questions so this script can generate the files
needed by Pelican.
> Where do you want to create your new web site? [.]
> What will be the title of this web site? D3 WIKI
> Who will be the author of this web site? abhi1010
> What will be the default language of this web site? [en]
> Do you want to specify a URL prefix? e.g., http://example.com (Y/n) n
> Do you want to enable article pagination? (Y/n) Y
> How many articles per page do you want? [10]
> What is your time zone? [Europe/Paris] Asia/Singapore
> Do you want to generate a Fabfile/Makefile to automate generation and publishing? (Y/n) Y
> Do you want an auto-reload & simpleHTTP script to assist with theme and site development? (Y/n) Y
> Do you want to upload your website using FTP? (y/N) N
> Do you want to upload your website using SSH? (y/N) N
> Do you want to upload your website using Dropbox? (y/N) N
> Do you want to upload your website using S3? (y/N) N
> Do you want to upload your website using Rackspace Cloud Files? (y/N) N
> Do you want to upload your website using GitHub Pages? (y/N) N
Done. Your new project is available at /Users/apandey/code/githubs/d3wiki
If you try to simply call the pelican command to build the static site, you will notice a lot of errors like:
$ fab build
ERROR: Skipping ./请求.md: could not find information about 'NameError: title'
ERROR: Skipping ./过渡.md: could not find information about 'NameError: title'
ERROR: Skipping ./选择器.md: could not find information about 'NameError: title'
ERROR: Skipping ./选择集.md: could not find information about 'NameError: title'
Done: Processed 0 articles, 0 drafts, 0 pages and 0 hidden pages in 3.47 seconds.
The problem is that pelican expects some variables to be defined in each markdown file before it can build the static file.
Some of the variables are:
Title
Slug
Date
You may add your own as well.
However, for our initial purposes, we will keep it simple and just try to add these.
Next, how do we achieve this automation?
fab is our answer.
Let’s write a function in python that will modify the markdown files and update them to add Title, Slug, Date
We will edit fabfile.py and add a new function create_wiki:
import os

def create_wiki():
    files = []
    # Find all markdown files in the content folder
    for f in os.walk('./content/'):
        fpath = lambda x: os.path.join(f[0], x)
        for file in f[2]:
            fullpath = fpath(file)
            files.append(fullpath)
    filtered = [f for f in files if f.endswith('.md')]
    for file in filtered:
        with open(file, 'r+') as f:
            content = f.read()
            f.seek(0, 0)
            base = os.path.basename(file).replace('.md', '')
            lines = ['Title: {}'.format(base.replace('-', ' ')),
                     'Slug: {}'.format(base),
                     'Date: 2015-08-07T14:59:18-04:00',
                     '', '']
            line = '\n'.join(lines)
            # Prepend the metadata block to the original contents
            f.write(line + '\n' + content)
            print(file)
    # build and serve the website (helpers already defined in the quickstart fabfile)
    build()
    serve()
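With the function added to fabfile.py, the whole flow runs with a single command:

$ fab create_wiki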
There is a minor issue here, though: you will notice that the menu is not available - it is all empty.
It is an easy addition. We just need to add a few lines to publishconf.py to define what the menu should be.
For my example, I have chosen to show the following for D3:
API Reference
Tutorials
Plugins
# We don't want all pages to show up in menu
DISPLAY_PAGES_ON_MENU = False

# Choose the specific pages that should be part of menu
MENUITEMS = (('HOME', '/home.html'),
             ('API Reference', '/API-Reference.html'),
             ('Tutorials', '/Tutorials.html'),
             ('Plugins', '/Plugins.html'),)
Choosing themes
By default, pelican uses its own theme for the static site, but the theme can be changed.
Let’s choose pelican bootstrap3 for our example here:
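Once the pelican-bootstrap3 theme is cloned locally, pointing Pelican at it is a single setting - the path below is an assumption:

# pelicanconf.py - point Pelican at the locally cloned theme
THEME = '/Users/apandey/code/githubs/pelican-bootstrap3'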
Sometimes you may notice that old containers are not cleaned up from the Beanstalk environment. This may be because a container is still running as a ghost in the background. One way to find out is to quickly check whether your
/var/lib/docker/vfs/dir directory has too many folders.
Next, find out what container processes you have going on.
[root@ip dir]# docker ps -a
You might see something like this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1611e5ebe2c0 aws_beanstalk/staging-app:latest "supervisord -n" About an hour ago boring_galileo
e59d0dd8bba1 aws_beanstalk/staging-app:latest "supervisord -n" About an hour ago desperate_yalow
3844d0e18c47 aws_beanstalk/staging-app:latest "supervisord -n" 2 hours ago Up 8 minutes 80/tcp pensive_jang
Ideally, we want to “forcibly remove” all images (and hence the folders from /var/lib/docker/vfs/dir directory) that are not in use anymore.
Just run the following to test whether it works:
docker rmi -f `docker images -aq`
You might run into trouble where docker says that those images are in use by existing containers. This means the containers are orphaned but not killed as we thought. Let's remove them, along with any shared volumes, for each one of them.
docker rm -fv `docker ps -aq`
This will
kill the container
unlink the volumes
You should see a lot more space now on your beanstalk instance.
How do you partition a list around a value x, such that all nodes less than x come before all nodes greater than or equal to x?
Well, there are several possible solutions. The one I came up with is a bit convoluted, but let me explain the idea behind it. You want to track the following:
Two pointers to remember the beginning of the lower and higher series each
One pointer (current) to iterate through the Linked List
The list may itself start with higher or lower value compared to the middleValue. Thus we need to remember the beginning of the lower series (lowerSeries) as this is what we will send back
Now that we have this out of the way, let’s look at the code:
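Below is a minimal Python sketch of the idea, using the lowerSeries/middleValue naming from the description above (not the original listing):

class Node:
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def partition(head, middle_value):
    # Track the beginning (and end) of the lower and higher series
    lower_series = lower_tail = None
    higher_series = higher_tail = None

    current = head
    while current:
        nxt = current.next
        current.next = None
        if current.val < middle_value:
            if lower_series is None:
                lower_series = lower_tail = current
            else:
                lower_tail.next = current
                lower_tail = current
        else:
            if higher_series is None:
                higher_series = higher_tail = current
            else:
                higher_tail.next = current
                higher_tail = current
        current = nxt

    # The lower series is what we send back, with the higher series appended
    if lower_series is None:
        return higher_series
    lower_tail.next = higher_series
    return lower_series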
Next up: returning the Kth-to-last element of a linked list. It is possible to write a recursive solution, but I will use simple runner logic; recursive solutions are usually less optimal.
Note here that, in our logic K=1 would return the last element in the linked list. Similarly, K=2 would return the second last element.
The suggested solution here is to use two pointers:
One pointer will first travel K items into the list
Once that is done, both the pointers start travelling together, one item at a time
They keep travelling until the end of linked list is found
In that situation, the first pointer is at the end of the list, but the second pointer will only have reached the Kth-to-last element - which is exactly what you want. A sketch follows this list.
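A minimal Python sketch of that two-pointer approach, reusing the Node class from the partition sketch above:

def kth_to_last(head, k):
    # First pointer travels k items into the list
    first = head
    for _ in range(k):
        if first is None:
            return None  # the list has fewer than k nodes
        first = first.next

    # Both pointers now travel together until first falls off the end;
    # second is then sitting on the kth-to-last node (k=1 -> last node)
    second = head
    while first:
        first = first.next
        second = second.next
    return second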
The next problem: removing duplicates from a linked list. Two approaches come to mind:
Create a new Linked List containing only the unique items
Iterate through the Linked List and remove items that are repeated
The internal structure for the algo can be either map or set based. When using a map, the Node itself can be saved, which makes your life easier if you are creating a new Linked List. However, a set is very useful if we are just iterating through the Linked List and deleting the repeated items; it is also a great space saver. Hence we decided to go down this path; a sketch follows.
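A minimal Python sketch of the set-based, in-place approach, again reusing the Node class from above:

def remove_duplicates(head):
    # Remember values already seen and unlink any repeated nodes
    seen = set()
    previous = None
    current = head
    while current:
        if current.val in seen:
            previous.next = current.next  # delete the duplicate in place
        else:
            seen.add(current.val)
            previous = current
        current = current.next
    return head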