Posts
ckan-docker updated
30 Nov 2014
Check out the update of ckan-docker on GitHub.
It’s a big update that adds a data container to the project to store the CKAN FileStore and the Postgres data & config.
After a few weeks working with this project for a client, I have realised how important that is, and that docker cp and pg_dump are not quick enough to keep your productivity up.
It’s very easy to use: in fig.yml, a service called data is built from the data Dockerfile, and the Postgres & CKAN containers inherit the volumes from the data container (/var/lib/ckan, /etc/postgresql/9.3/main and /var/lib/postgresql/9.3/main) with the volumes_from instruction.
Of course, custom locations can be set if you prefer to store things elsewhere, but I deliberately chose to override the standard locations by default, to make the project easier for anyone to take on. The locations are identified by environment variables when the image is created, so you would only have to change their values, and the value of volumes to match.
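To make the idea concrete, here is a minimal sketch of what such a fig.yml could look like. It is not the file from the repository: the build paths, the postgres and ckan service names and the link alias are assumptions made for illustration. Only the data service and the volumes_from instruction come from the description above.

```shell
# Sketch only: build paths and the postgres/ckan service names are hypothetical.
cat > fig.yml <<'EOF'
# Data-only container: its image declares the shared volumes
data:
  build: docker/data

# Postgres keeps its config and data in the volumes owned by "data"
postgres:
  build: docker/postgres
  volumes_from:
    - data

# CKAN stores FileStore uploads in /var/lib/ckan, also owned by "data"
ckan:
  build: .
  links:
    - postgres:postgres
  volumes_from:
    - data
EOF
```

With a layout like this, the postgres and ckan containers can be removed and rebuilt while the files held by data stay in place, as long as the data container itself is not removed.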
Check this out on the CKAN GitHub repo
Cheers,
Clément
CKAN Docker Official Repo
26 Oct 2014
A few days ago I submitted a pull request to share my work on Docker for CKAN with the rest of the community.
But it’s a little too big :) … 37 files changed with 1,266 additions and 115 deletions.
So we’ve decided to move the Docker, Fig & Vagrant stuff out of the main CKAN repo, which makes a lot of sense.
In the meantime I’ve officially become a member of the CKAN organisation on GitHub, which is pretty cool!
Extracting the Docker stuff out got me thinking a lot, and I re-factored even more stuff.
I’ve come up with a radically different tree structure:
The point of this structure is that it should be easy to manage your entire project. You should be able to wrap everything you need in there, and package it.
I’ve also consolidated the various Dockerfiles I wrote for CKAN (default, custom & dev) into one :)
- all processes are managed by supervisor, which makes it easier to shut-down Apache in a development context, and use paster instead.
- Nginx is now gone from the CKAN Dockerfile, and this service is handled by another container, as it should be.
- The requirement of being able to make live edits on the code for development is covered by mounting a volume as source directory, which overrides the data that was initially copied in the container. Pip requirements remain and I just have to re-install the packages automatically as part of the init process.
- I’ve added a lot of ONBUILD triggers that allow building child images for dev & prod, which covers what I was initially doing with my custom Dockerfile.
- I’ve also extracted the datapusher from the CKAN config, into a separate container.
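As a rough illustration of how ONBUILD triggers let a child image pull in its own files, here is a minimal sketch. The image name example/ckan-base, the directory layout and the file names are all hypothetical, and the real Dockerfiles in the repository will differ.

```shell
# Sketch only: image name, paths and file names below are hypothetical.
mkdir -p onbuild-example/base onbuild-example/custom

# Parent image: ONBUILD instructions are recorded but not executed here
cat > onbuild-example/base/Dockerfile <<'EOF'
FROM phusion/baseimage:0.9.13
# These run later, as part of the FROM step of any image built on this one
ONBUILD ADD ./custom_options.ini /etc/ckan/default/custom_options.ini
ONBUILD ADD ./extensions /usr/lib/ckan/default/src/extensions
EOF

# Child image: the triggers above fire right after this FROM line
cat > onbuild-example/custom/Dockerfile <<'EOF'
FROM example/ckan-base
EOF
```

Building the parent with docker build -t example/ckan-base onbuild-example/base and then building the child would copy the child's own files in, provided custom_options.ini and extensions exist in the child's build context.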
And there is more…
Check this out on the CKAN GitHub repo
Cheers,
Clément
CKAN Development & Deployment using Docker, Fig & Vagrant
19 Oct 2014
Intro
When I discovered CKAN over a year ago it was at version 1.7 or 1.8, and the implementation I studied was using Elasticsearch for indexing… I was really confused by the complexity of the setup and it took me a few attempts to get my “play box” right.
By the time I started working on a project based on CKAN 2.2, the documentation had improved tremendously, and projects such as Data.gov.uk To Go as well as CKAN Packaging Scripts had really improved the way you can package & deploy the many components of a typical CKAN portal.
A few months ago I gave a short talk on CKAN at a JBug Scotland event and I was really amazed by Ian Lawson’s presentation on OpenShift. A few months after that I discovered Docker, and also found out that OpenShift was going to support Docker.
I thought this was really good news, because it meant that if I “containerised” my CKAN install I could use the same containers in every environment I’m working on: Dev, Test, Staging, Prod, Cloud!… And I wasn’t the only one thinking that way: in May, Nick Stenning made a great Pull Request with the first containers for CKAN.
There were a few issues though: there was no datastore, ckanext-spatial could not be set up because PostGIS was not installed, editing the config was complex and not very flexible, and the three containers were using different bases, which meant that you were pulling three base images instead of caching one.
Standing on the shoulders of giants
I picked up from there and re-factored the containers one by one, starting with Postgres, then Solr & CKAN. When that was done I created another Dockerfile that extends the main CKAN Dockerfile to allow custom configurations based on the core project.
- All containers extend the same base image, phusion/baseimage (updated to 0.9.13), which means you only pull & cache it once. The first few steps are also identical, to rely on the Docker cache as much as possible.
- The Postgres container installs PostGIS, configures the database, the datastore & PostGIS on the CKAN database. The default names & passwords can easily be overridden with environment variables.
- The Solr container has been updated to 4.10.1.
- The CKAN Core container has been updated to configure the datapusher, has all the dependencies required to use the spatial extension & also supervisor to manage tasks.
- The custom config shows how to extend the Core container to enable common extensions such as ckanext-viewhelpers, ckanext-archiver, ckanext-spatial, ckanext-harvest, and how you can extract services such as redis from the CKAN container and let that service be handled by a separate container.
Docker
Building containers is easy, caching is powerful. But you need to cheat sometimes, especially with the ADD command.
In the Solr container for instance, I quickly realised that an ADD instruction fetching the Solr archive from a remote URL is not cached, whereas a RUN instruction downloading the same archive with wget is.
And since the Solr tarball is over 100 MB, installing wget & cheating is really worth it!
In cases like that RUN is more appropriate than ADD, but it really depends on the use case.
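The two variants can be sketched side by side as follows. These are not the original lines from the Solr Dockerfile: the download URL, the target path and the file name Dockerfile.solr-sketch are assumptions chosen for illustration.

```shell
# Sketch only: the download URL and target path are assumptions.
cat > Dockerfile.solr-sketch <<'EOF'
FROM phusion/baseimage:0.9.13

# Variant 1 (left commented out): ADD with a remote URL, the form reported
# above as not being cached
# ADD http://archive.apache.org/dist/lucene/solr/4.10.1/solr-4.10.1.tgz /tmp/

# Variant 2: install wget once, then download in a RUN step. As long as these
# lines and the ones above them do not change, Docker reuses the cached layers.
RUN apt-get update && apt-get install -y wget
RUN wget -q -O /tmp/solr-4.10.1.tgz http://archive.apache.org/dist/lucene/solr/4.10.1/solr-4.10.1.tgz
EOF
```

The trade-off is that a RUN layer is cached on the text of the instruction alone, so if the file behind the URL changes, Docker will not notice until the cache is invalidated.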
Managing containers can be tedious, especially when you’re developing them. There are a lot of tools to help. I’ve not tried Shipyard yet but I will soon. In the meantime docker-cleanup is pretty useful, and the usual docker stop $(docker ps -aq) and docker rm $(docker ps -aq) work great to clean up any running containers.
But when I’m working with a custom Docker container I have to type (or copy & paste) 4 commands to build the containers, 4 commands to run them… and just as many to stop them.
This is a bit tedious, and that’s why I looked at Fig.
Fig
Fig allows you to define all the above in a single YAML file to do the following:
- start, stop and rebuild services
- view the status of running services
- tail running services’ log output
- run a one-off command on a service
So the 8+ commands above are reduced to one, fig up, thanks to the service definitions in fig.yml.
Fig can also simplify the rest of the docker commands you want to run, to view logs etc.
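As a rough idea of what such a definition might contain, here is a minimal sketch of a fig.yml for three services, followed by the Fig commands that correspond to the list above. The build paths, port mapping and link aliases are hypothetical and not taken from the project.

```shell
# Sketch only: build paths, ports and link aliases are hypothetical.
cat > fig.yml <<'EOF'
postgres:
  build: docker/postgres
solr:
  build: docker/solr
ckan:
  build: .
  links:
    - postgres:postgres
    - solr:solr
  ports:
    - "5000:5000"
EOF

# With Fig installed, the list above maps onto these commands:
#   fig up -d            start (and build, if needed) every service
#   fig stop             stop them again
#   fig build            rebuild the images
#   fig ps               view the status of running services
#   fig logs ckan        show the log output of one service
#   fig run ckan env     run a one-off command on a service
```

Because ckan links to postgres and solr, Fig starts those two first, which replaces the manual ordering of the individual docker run commands.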
Vagrant
Now you may wonder why do you need/want Vagrant? The whole point about Docker is that containers are not VMs, and Fig has reduced the complexity of managing containers, why would you want to bring virtualisation back in the picture?
Well, the answer is simple: portability. I have a personal Mac, a work PC, and Linux servers… Docker works on all those operating systems: natively on Linux, and through a proxy VM, Boot2docker, on OS X & Windows. I love this project, it’s fast, lightweight & simple to use, but it doesn’t support volumes & shared folders on Windows yet (Boot2docker 1.3 offers partial support on Mac OS X), and it doesn’t really represent your production host.
That’s why I think Vagrant is useful, and I was really excited to see support for Docker added in Vagrant 1.6.
My goal was to make sure that any development environment would represent production and behave exactly the same. This also helps portability of the environment, since a simple command, vagrant up --provider=docker --no-parallel, will create Linux hosts running Docker if required (OS X & Windows), build & boot all the containers in order & mount the source directory on your machine as a volume inside the container.
The development Dockerfile is slightly different & designed to be lightweight: Apache & Nginx are not installed, and paster serve does just what you want on a dev box. vagrant ssh also works a treat with the Phusion baseimage, and you can ssh directly into the container.
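For readers who have not used the Docker provider, a minimal Vagrantfile for one container might look like the sketch below. It is not the example file from the repository: the machine name, the port mapping and both folder paths are hypothetical.

```shell
# Sketch only: the container name, port and both folder paths are hypothetical.
cat > Vagrantfile <<'EOF'
Vagrant.configure("2") do |config|
  config.vm.define "ckan" do |ckan|
    # Mount the source directory from your machine inside the container
    ckan.vm.synced_folder "./src", "/usr/lib/ckan/default/src/ckan"

    ckan.vm.provider "docker" do |d|
      d.build_dir = "."          # build the image from the local Dockerfile
      d.name      = "ckan"
      d.ports     = ["5000:5000"]
      d.has_ssh   = true         # tells Vagrant the container runs an SSH server
    end
  end
end
EOF
```

Setting has_ssh is what lets vagrant ssh reach the container, which assumes the image runs an SSH daemon that accepts Vagrant's key.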
Wrap up
That was a great personal journey into containerisation & virtualisation to build consistent & portable development environments. There’s still work to be done on the core Dockerfile to extract Nginx from the main container & link the official Nginx container instead. The example Vagrantfile is really just a template to show what’s possible, and at the moment it only maps the CKAN source directory, so you would have to add new synced folders to build your custom extensions. It’s just one step further, and hopefully it’s just a start.
Next
Check this out on my GitHub repo
Clément
References
some really good reading
- Vagrant with Docker: How to set up Postgres, Elasticsearch and Redis on Mac OS X – maori.geek
- Vagrant 1.6 Feature Preview: Docker-Based Development Environments - Vagrant
- Building a Development Environment with Docker - Terse Systems
- A Rails Development Environment with Docker and Vagrant
- VirtualBox guest-specific operations error · Issue #81 · tmatilai/vagrant-proxyconf
- Setting up a development environment using Docker and Vagrant - Zenika
- vagrant-cachier :: viewdocs.io
- Docker in OSX via boot2docker or Vagrant: getting over the hump
- Rails Development Using Docker and Vagrant - Abe Voelker
- Vagrant Synced Folders Permissions - jeremykendall.net
- Dockerfile.tmpl
- Get Started with Docker Containers in RHEL 7 - Red Hat Customer Portal
- Docker - OpenStack
- bnchdrff/dockerfiles
- Docker Images / Demo CKAN
- Allow customised CKAN Docker images (fixes #1904) by cygri · Pull Request #1929 · ckan/ckan · GitHub
- Quickly SSH into a Docker container
- harbur/docker-workshop
- How to Use Docker on OS X: The Missing Guide
- Use Docker to Build a LEMP Stack (Buildfile)
Working with Jekyll
11 Oct 2014
I’ve spent a few hours playing with this blog and Jekyll itself. It’s pretty cool. I really like the flexibility of the platform, and the fact you can pretty much throw any markup at it… I have blog pages like this in Markdown, and other pages in HTML, and you can pick and choose the Markdown flavour as well.
It’s also my first time playing with SASS. I’ve used LESS in the past, when developing custom templates & themes for CKAN, and to be fair I can’t see much difference between the two, both work pretty well and do what you expect; functions, variables etc.
Jekyll is a really useful tool if you know what you want, because you’ll have the freedom to do whatever you want. The drawback is that you have to do a lot more than you have to with a CMS such as Drupal or WordPress… Mostly because you don’t rely on a database or index to do search & faceting. I’ll have a look at plugins & modules to cover that at some point.
Anyway the fun is there for sure. I’ll be posting soon about Docker & CKAN containers.
Clément
Say Hi to Jekyll!
10 Oct 2014
Say hi to http://clementmouchet.github.io
I’ve just created this website/blog using Jekyll, jQuery & Bootstrap, and loved the simplicity and the free hosting on GitHub!
Jekyll is really simple and easy to manage: no databases, just Markdown pages, like the one you’re reading now.
Visit the jekyll website to find out more
I’ll add a few posts as soon as possible with some stuff regarding Docker.
Cheers,
Clément