Installation¶
MegaQC has been written in Python using the Flask web framework. MegaQC is designed to be very simple to get up and running for basic testing and evaluation, yet super easy to configure for a high performance production installation. The various ways of getting a runnable MegaQC instance are explained in the following sections.
Production¶
This section explains how to set up a production environment without containers. If you want to run MegaQC in a containerized environment, please refer to Docker.
1. Install the MegaQC package¶
MegaQC is available on the Python Package Index (PyPI). We are planning to add MegaQC to Conda soon. To install from PyPI, run the following command:
pip install megaqc[prod]
2. Export environment variables¶
By default, MegaQC runs in development mode with a sqlite flat file database (this is to make it as simple as possible to get up and running for a quick test / demo). To tell MegaQC to use a production server, you need to set the MEGAQC_PRODUCTION environment variable to true (export MEGAQC_PRODUCTION=1).
If you are running MegaQC behind a custom domain name (recommended, it’s nicer than just having a difficult to remember IP address), then you need to set SERVER_NAME to the URL of the website.
Add the following lines to your .bashrc file:
export MEGAQC_PRODUCTION=1
export SERVER_NAME='http://megaqc.yourdomain.com'
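After opening a new shell, you may want to confirm that the variables were actually picked up. The following is a minimal sketch in Python using only the standard library. The function name missing_production_vars is made up for this example and is not part of MegaQC, and it only checks that the variables are non-empty, not that MegaQC accepts their values. SERVER_NAME is only relevant if you use a custom domain name.

```python
import os

# Variables named in this section. SERVER_NAME only matters when MegaQC
# is served behind a custom domain name.
REQUIRED_VARS = ("MEGAQC_PRODUCTION", "SERVER_NAME")


def missing_production_vars(env=None):
    """Return the names of the variables that are unset or empty.

    Hypothetical helper for illustration; not part of MegaQC.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_production_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("Production environment variables are set.")
```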
3. Set up the database¶
MegaQC uses the Flask SQLAlchemy plugin, meaning that it can be used with any SQL database (PostgreSQL, MySQL, SQLite and others).
MegaQC has been developed with PostgreSQL; see below for instructions. If you use MegaQC with any other database tools and could contribute to the documentation, that would be great!
3.1 Using a PostgreSQL database¶
First, install PostgreSQL: https://wiki.postgresql.org/wiki/Detailed_installation_guides
Then, install the Python package that handles connections to PostgreSQL:
pip install psycopg2
MegaQC can assess whether the database to use is postgresql. If it is, it will try to connect as megaqc_user to the database megaqc on localhost:5432. On failure, MegaQC will attempt to create the user and the database, and will then export the schema.
In order to make this happen, run:
megaqc initdb
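If megaqc initdb cannot connect, it can help to first confirm that something is listening on localhost:5432. The following is a minimal Python sketch using only the standard library. The function name port_is_open is made up for this example and is not part of MegaQC. It only tests that a TCP connection can be opened, not that the megaqc_user account or the megaqc database exist.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; not part of MegaQC.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    state = "accepting" if port_is_open() else "not accepting"
    print(f"localhost:5432 is {state} TCP connections")
```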
3.2 Using a MySQL database¶
Although PostgreSQL is highly recommended, MegaQC should work with other SQL database back ends, such as MySQL.
Please note that MySQL support is currently untested and unsupported. If you use MegaQC with MySQL we’d love to hear about your experiences!
First, install MySQL: https://dev.mysql.com/doc/refman/5.7/en/installing.html
Then install the Python MySQL connector (alternatively with the PyPI package).
Now, create a custom MegaQC configuration file somewhere and set the environment variable MEGAQC_CONFIG to point to it. For example, in ~/.bashrc:
export MEGAQC_CONFIG="/path/to/megaqc_config.yaml"
Then, in this file, set the following configuration key-value pair:
SQLALCHEMY_DBMS: mysql
This should, hopefully, make everything work. If you have problems, please create an issue and we’ll do our best to help.
4. (Optional, but recommended) Create a reverse proxy¶
Apache¶
Note: You can skip this step if you wish to use gunicorn as your primary web server, but it’s not recommended.
Note: This is an example configuration that will map all HTTP requests reaching the current server to MegaQC. It does not filter anything. Please consider your server security!
Update your apache configuration (/usr/local/apache2/conf/httpd.conf, /etc/apache2/apache2.conf, /etc/httpd/conf/httpd.conf …) to include, for example (Apache 2.2):
<VirtualHost *:80>
    SetEnv proxy-sendcl 1
    ProxyPass / http://127.0.0.1:8000/
    ProxyPassReverse / http://127.0.0.1:8000/
    <Proxy *>
        Order Allow,Deny
        Allow from all
    </Proxy>
</VirtualHost>
You also need to ensure that apache mod_proxy is activated:
a2enmod proxy
a2enmod proxy_http
In order for these changes to be applied, you need to restart apache with the following command (or equivalent on your system):
service httpd restart
NGINX¶
An example NGINX configuration is provided in the deployment folder. Please note that it is designed to work with the Docker Compose stack. For more details please refer to The MegaQC Docker Compose stack.
5. Start the web server¶
gunicorn --log-file megaqc.log --timeout 300 megaqc.wsgi:app
Note: We recommend using a long timeout, as the data upload from MultiQC can take several minutes for large reports.
At this point, MegaQC should be running on the default gunicorn port (8000).
You should now have a fully functional MegaQC server running! 🎉
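To confirm that the server is answering HTTP requests, you could run a quick check like the one below. This is a minimal sketch using only the Python standard library. The function name server_responds is made up for this example, and the default URL assumes gunicorn is listening on 127.0.0.1:8000 as described above.

```python
import urllib.error
import urllib.request


def server_responds(url="http://127.0.0.1:8000/", timeout=5.0):
    """Return True if any HTTP response (even an error status) comes back.

    Hypothetical helper for illustration; not part of MegaQC.
    """
    # An empty ProxyHandler bypasses any proxy configured in the
    # environment, so the request goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with a 4xx or 5xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is answering.
        return False


if __name__ == "__main__":
    print("Server is responding" if server_responds() else "No response")
```

If this reports no response, megaqc.log (the file passed to --log-file above) is the first place to look.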
Troubleshooting¶
The password encryption relies on the libffi-devel package to work. If you run an older OS, ensure that the package is installed.
Docker¶
MegaQC offers two ways of getting a containerized setup running:
A single Docker container containing MegaQC with a Gunicorn WSGI HTTP server
A Docker Compose stack containing the MegaQC container, a Postgres container and an NGINX container
The MegaQC Docker container¶
Overview¶
The MegaQC container is built from two base images: the Node container, which compiles all the JavaScript, and the Gunicorn Flask container, which provides Gunicorn, Flask and MegaQC preconfigured for production deployments. The Gunicorn Flask container is also the one that runs the final server.
Pulling the docker image from dockerhub¶
To run MegaQC with docker, simply use the following command:
docker run -p 80:80 ewels/megaqc
This will pull the latest image from dockerhub and run MegaQC on port 80.
Note that you will need to publish the port in order to access it from the host, or other machines. For more information, read https://docs.docker.com/engine/reference/run/ .
Building your own docker image¶
If you prefer, you can build your own docker image after pulling the MegaQC code from GitHub. Simply cd to the MegaQC root directory and run:
docker build . -t ewels/megaqc
You can then run MegaQC as described above:
docker run -p 80:80 ewels/megaqc
Configuration¶
Besides the sections below, it is also recommended to read the Gunicorn Flask container documentation, which explains how to customize the host IP where Gunicorn listens for requests, the port the container should listen on, bind (the actual host and port passed to Gunicorn), and custom Gunicorn configuration files.
Environment variables¶
By default, the MegaQC related environment variables are set to:
MEGAQC_PRODUCTION=1
MEGAQC_SECRET="SuperSecretValueYouShouldReallyChange"
MEGAQC_CONFIG=""
APP_MODULE=megaqc.wsgi:app
DB_HOST="127.0.0.1"
DB_PORT="5432"
DB_NAME="megaqc"
DB_USER="megaqc"
DB_PASS="megaqcpswd"
To run MegaQC with custom environment variables use the -e key=value run options. For more information, please read Docker - setting environment variables.
For example, to run MegaQC with a custom database password:
docker run -e DB_PASS=someotherpassword ewels/megaqc
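When several variables need overriding, typing each -e option by hand gets error-prone. As a rough sketch, the command line can be assembled from a dictionary in Python. The function name docker_run_command is made up for this example, and the values shown are placeholders. Only the docker run options -e and -p used elsewhere on this page are assumed.

```python
import shlex


def docker_run_command(image, env=None, ports=None):
    """Build a `docker run` argument list with -p and -e options.

    Hypothetical helper for illustration; not part of MegaQC or Docker.
    """
    cmd = ["docker", "run"]
    for host_port, container_port in (ports or {}).items():
        cmd += ["-p", f"{host_port}:{container_port}"]
    for key, value in (env or {}).items():
        cmd += ["-e", f"{key}={value}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    cmd = docker_run_command(
        "ewels/megaqc",
        # Placeholder values: replace them with your own secrets.
        env={"DB_PASS": "someotherpassword", "MEGAQC_SECRET": "change-me"},
        ports={80: 80},
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```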
Furthermore, be aware that the default latest tag will typically be a development version and may not be very stable. You can specify a tagged version to run a release instead:
docker run -p 80:80 ewels/megaqc:v0.1
Also note that docker will use a local version of the image if it exists. To pull the latest version of MegaQC use the following command:
docker pull ewels/megaqc
Using persistent data¶
The Dockerfile has been configured to automatically create persistent volumes for the data and log directories. These volumes will be created without any additional input from the user, but if you want to re-use them with a new container you must specify them when running the docker image.
The easiest way to ensure the database persists between container states is to always specify the same volume for /usr/local/lib/postgresql. If a volume is found with that name it is used, otherwise it creates a new volume.
To create or re-use a docker volume named pg_data:
docker run -p 80:80 -v pg_data:/usr/local/lib/postgresql ewels/megaqc
The same can be done for a log directory volume called pg_logs:
docker run -p 80:80 -v pg_data:/usr/local/lib/postgresql -v pg_logs:/var/log/postgresql ewels/megaqc
If you did not specify a volume name, docker will have given it a long hex string as a unique name. If you do not use volumes frequently, you can check the output from docker volume ls and docker volume inspect $VOLUME_NAME. However, the easiest way is to inspect the docker container.
# ugly default docker output
docker inspect --format '{{json .Mounts}}' example_container
# use jq for pretty formatting
docker inspect --format '{{json .Mounts}}' example_container | jq
# or use python for pretty formatting
docker inspect --format '{{json .Mounts}}' example_container | python -m json.tool
Example output for the above, nicely formatted:
[
  {
    "Type": "volume",
    "Name": "7c8c9dfbcc66874b472676659dde6a5c8e15dea756a620435c83f5980c21d804",
    "Source": "/var/lib/docker/volumes/7c8c9dfbcc66874b472676659dde6a5c8e15dea756a620435c83f5980c21d804/_data",
    "Destination": "/usr/local/lib/postgresql",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  },
  {
    "Type": "volume",
    "Name": "6d48d24a660d078dfe4c04960aeb1848ea688a3eae0d4b7b54b1043f7885e428",
    "Source": "/var/lib/docker/volumes/6d48d24a660d078dfe4c04960aeb1848ea688a3eae0d4b7b54b1043f7885e428/_data",
    "Destination": "/var/log/postgresql",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
]
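To re-use these volumes you need the Name that belongs to each Destination. As a minimal sketch, the JSON above can be reduced to that mapping in Python. The function name volumes_by_destination is made up for this example, and the sample string stands in for real docker inspect output.

```python
import json


def volumes_by_destination(mounts_json):
    """Map each mount destination inside the container to its volume name.

    Hypothetical helper for illustration; expects the JSON printed by
    docker inspect --format '{{json .Mounts}}'.
    """
    mounts = json.loads(mounts_json)
    return {
        mount["Destination"]: mount["Name"]
        for mount in mounts
        if mount.get("Type") == "volume"  # skip bind mounts, which have no Name
    }


if __name__ == "__main__":
    # Shortened sample; real volume names are long hex strings.
    sample = (
        '[{"Type": "volume", "Name": "pg_data",'
        ' "Destination": "/usr/local/lib/postgresql"}]'
    )
    for destination, name in volumes_by_destination(sample).items():
        # Print the -v option that would re-attach this volume.
        print(f"-v {name}:{destination}")
```

To use it on a real container, replace the sample string with the text read from standard input and pipe the docker inspect output into the script.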
Running MegaQC with a local Postgres database¶
To access a Postgres database running on localhost, you need to use the host’s networking. For more information, read https://docs.docker.com/network/host/ .
An example command to run MegaQC with a Postgres database which is accessible on localhost:5432 looks as follows:
docker run --network="host" -p 5432 ewels/megaqc
Note that by default localhost=127.0.0.1.
The MegaQC Docker Compose stack¶
Since a fully working and performant MegaQC instance depends on a SQL database and a reverse proxy, MegaQC offers a docker-compose stack, which sets up three containers for a zero configuration setup.
Overview¶
The docker-compose configuration can be found in the deployment folder. It provides the MegaQC Docker container, a postgres container for the SQL database and an nginx container for the reverse proxy setup.
Usage¶
The docker-compose configuration and its associated .env file are both located in the deployment folder. To spin up all containers, simply run the following from inside the deployment folder:
docker-compose up
All containers should now spin up and the MegaQC server should be accessible on 0.0.0.0:80.
Alternatively, you can spin up the containers in the background:
docker-compose up -d
The -d option detaches from the containers, but will keep them running.
Configuration¶
Environment variables¶
The default environment variables used when starting the MegaQC Docker container are defined inside the .env file. Simply edit the file and the new environment variables will be passed to the MegaQC Docker container.
Further runtime arguments¶
Further runtime arguments can be added to a command section inside the docker-compose configuration file.
HTTPS¶
By default, the MegaQC stack ships with a self-signed SSL certificate for testing purposes.
For this reason we recommend that you use HTTP to access the stack.
However, if you want to enable HTTPS, perhaps because you are making MegaQC available on the public internet, then it should be simple to install your own certificates.
To do so, go to the deployment directory and edit the .env file. Then, set these lines to the full file paths of the respective .crt and .key files:
CRT_PATH=./nginx-selfsigned.crt
KEY_PATH=./nginx-selfsigned.key
After this, run the stack as described above, and then you should be able to access MegaQC on https://your_hostname.
Development¶
1. Clone the repo¶
If you’re doing development work, you need access to the source code:
git clone https://github.com/ewels/MegaQC
2. Install Dependencies¶
You should install MegaQC and all of its dependencies using Poetry:
cd MegaQC
poetry install
4. Enable development mode:¶
Setting this bash variable runs MegaQC in development mode. This means that it will show full Python exception tracebacks in the web browser and enable additional Flask plugins which help with debugging and performance testing.
export FLASK_DEBUG=1
5. Set up the database¶
Running this command creates an empty SQLite MegaQC database file in the installation directory called megaqc.db:
megaqc initdb
6. Start megaqc¶
Start MegaQC.
megaqc run
You will have to run the rest of these commands in another terminal window, because megaqc run blocks the terminal.
7. Setup your access key¶
Register with MegaQC in your browser by browsing to http://localhost:5000/register/ (the port might differ; it will depend on what was output in the megaqc run stage previously).
Once registered, visit http://localhost:5000/users/multiqc_config and follow the instructions there to configure your access token in ~/.multiqc_config.yaml.
Note: if you’d rather not pollute your home directory, you can instead name the file multiqc_config.yaml and place it in the current (MegaQC) directory. However, you will then have to run megaqc upload from that directory each time.
8. Load test data¶
In order to develop new features, you need some data to test them with:
git clone https://github.com/TMiguelT/1000gFastqc
for report in $(find 1000gFastqc -name '*.json'); do
    megaqc upload "$report"
done
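The shell loop above splits the find output on whitespace, so it can trip over paths that contain spaces. If that is a concern, the same upload can be sketched in Python. The function names find_reports and upload_reports are made up for this example. The sketch assumes the megaqc command is on your PATH and that the access token from the previous step is configured.

```python
import subprocess
from pathlib import Path


def find_reports(root):
    """Return every *.json file below root, sorted for a stable order.

    Hypothetical helper for illustration; not part of MegaQC.
    """
    return sorted(Path(root).rglob("*.json"))


def upload_reports(root):
    """Run `megaqc upload` once per report and stop on the first failure."""
    for report in find_reports(root):
        # Passing a list avoids shell word splitting on unusual file names.
        subprocess.run(["megaqc", "upload", str(report)], check=True)
```

Calling upload_reports("1000gFastqc") from the MegaQC directory would then upload each report in turn.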
9. Install the JavaScript and start compiling¶
This command will run until you cancel it, but will ensure that any changes to the JavaScript are compiled instantly:
npm install
npm run watch
10. Install the pre-commit hooks¶
MegaQC has a number of pre-commit hooks, which automatically format and check your code before you commit. To set them up, run:
pre-commit install
From now on, whenever you commit, each changed file will get processed by the pre-commit hooks. If a file is changed by this process (because your code style didn’t match the configuration), you’ll have to git add the files again, and then re-run git commit.
If it lets you write a commit message then everything has succeeded.
Next Steps¶
You should now have a fully functional MegaQC test server running, accessible on your localhost at http://127.0.0.1:5000.
Migrations¶
Introduction¶
Migrations are updates to a database schema. This is relevant if, for example, you set up a MegaQC database (using initdb), and then a new version of MegaQC is released that needs new tables or columns.
When to migrate¶
Every time a new version of MegaQC is released, you should ensure your database is up to date. You don’t need to run the migrations the first time you install MegaQC, because the megaqc initdb command replaces the need for migrations.
How to migrate¶
To migrate, run the following commands:
cd megaqc
export FLASK_APP=wsgi.py
flask db upgrade
Note: when you run these migrations, you must have the same environment as you use to run MegaQC normally, which means the same value of FLASK_DEBUG and MEGAQC_PRODUCTION environment variables. Otherwise it will migrate the wrong database (or a non-existing one).
Stamping your database¶
The complete migration history has only recently been added. This means that, if you were using MegaQC in the past when migrations were not included in the repo, your database won’t know what version you’re currently at.
To fix this, first you need to work out which migration your database is up to. Browse through the files in megaqc/migrations/versions, starting from the oldest date (at the top of each file), until you find a change that wasn’t present in your database. At this point, note the revision value at the top of the file (e.g. revision = "007c354223ec").
Next, run the following command, replacing <revision ID> with the revision you noted above:
flask db stamp <revision ID>
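To make browsing the migration files a little easier, the revision IDs can be collected with a short script. This is a minimal sketch that assumes each file in megaqc/migrations/versions contains a line of the form revision = "..." as in the example above. The function name list_revisions is made up for this example. Files are listed by name, not by date, so you still need to read each file to establish the order and to see what it changes.

```python
import re
from pathlib import Path

# Matches `revision = "..."` at the start of a line, but not `down_revision`.
REVISION_RE = re.compile(r"""^revision\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def list_revisions(versions_dir="megaqc/migrations/versions"):
    """Return (file name, revision ID) pairs for each migration script.

    Hypothetical helper for illustration; not part of MegaQC.
    """
    found = []
    for script in sorted(Path(versions_dir).glob("*.py")):
        match = REVISION_RE.search(script.read_text(encoding="utf-8"))
        if match:
            found.append((script.name, match.group(1)))
    return found


if __name__ == "__main__":
    for name, revision in list_revisions():
        print(f"{revision}  {name}")
```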