GITenberg is a project to collectively curate ebooks on GitHub. Gitberg is a command line tool to automate tasks on books stored in git repositories. Gitberg-autoupdate is a set of automated tools to continuously update GITenberg.
Some commands require a config file before they can be used. See gitberg's documentation for details.
To run the project in development mode, clone the repository and run:
python setup.py develop
The following environment variables must be set to the appropriate values:
- GITHUB_WEBHOOK_SECRET
- AWS_DEFAULT_REGION
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
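A small pre-flight check can catch a missing variable before a command fails later on. The sketch below is not part of gitberg-autoupdate: the helper name `missing_env_vars` is hypothetical, and it only checks that each variable is non-empty, not that its value is valid.

```python
import os

# Variables listed above for development.
REQUIRED_VARS = (
    "GITHUB_WEBHOOK_SECRET",
    "AWS_DEFAULT_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```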
To run the project tests:
python setup.py test
Both servers run in an AWS Elastic Beanstalk (EBS) application. webhook_server runs as a "Web server environment", and autoupdate_worker runs as a "Worker environment".
Configure the following environment variables (under Configuration > Software > Environment properties) for webhook_server environments:
- GITHUB_WEBHOOK_SECRET
- AWS_DEFAULT_REGION
- GITENBERG_SECRET
- GITBERG_GH_USER (deprecated)
- GITBERG_GH_PASSWORD (deprecated)
- GITBERG_GH_ACCESS_TOKEN (this is a 'user access token')
- SSH_KEY_PASSWORD (this is the password to deploy/autoupdate_worker/id_ed25519_password)
  - Note that you must set these all when first creating the environment. Otherwise, it won't start and is then apparently unrecoverable.
  - Don't set AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY. The EBS instance role will be picked up automatically instead.
- QTWEBENGINE_CHROMIUM_FLAGS="--no-sandbox" (needed to let calibre make PDF with QT and Chromium)
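For context on what GITHUB_WEBHOOK_SECRET is used for: GitHub signs each webhook delivery with an HMAC-SHA256 of the raw request body and sends it in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The sketch below shows that check in isolation. It is an illustration only, and the function names are hypothetical; the actual verification code in webhook_server may differ (for example, it may read a different signature header).

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Build the value GitHub sends in the X-Hub-Signature-256 header."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received header with the expected one in constant time."""
    expected = sign_payload(secret, body).encode("utf-8")
    received = (header_value or "").encode("utf-8")
    return hmac.compare_digest(expected, received)
```

The signature must be computed over the raw bytes of the request body, before any JSON parsing, or the digests will not match.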
The EBS environment must be configured as follows for webhook_server environments:
- Virtual machine instance profile: elasticbeanstalk-ec2-autoupdate
- Health check path (under Monitoring): /health
- Environment type: load balancing
- Load balancer: add an HTTPS listener on 443 which delivers to HTTP on 80, using an appropriate SSL certificate
Configure the following environment variables (under Configuration > Software > Environment properties) for autoupdate_worker environments:
- GITENBERG_SECRET
The EBS environment must be configured as follows for autoupdate_worker environments:
- Virtual machine instance profile: elasticbeanstalk-ec2-autoupdate
- Health check path (under Monitoring): /health
- Environment type: load balancing
- HTTP path (under Worker): /do_update
- Worker queue (under Worker): gitberg-autoupdate-repositories
- HTTP connections (under Worker): 5
- Visibility timeout (under Worker): 600
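To make these worker settings concrete: in an Elastic Beanstalk worker environment, a daemon reads messages from the worker queue and POSTs each message body to the configured HTTP path. A 200 response tells the daemon to delete the message; any other outcome leaves it on the queue to be retried. The sketch below shows the minimal shape of such an application using only the standard library. It is not the autoupdate_worker code: `handle_update` is a hypothetical stand-in for the real update logic, and `route` and `make_server` are illustrative names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_update(body: bytes) -> bool:
    """Hypothetical stand-in for the real update logic; True means success."""
    return len(body) > 0


def route(method: str, path: str, body: bytes = b""):
    """Return (status, text) for a request, independent of any socket."""
    if method == "GET" and path == "/health":
        return 200, "ok"
    if method == "POST" and path == "/do_update":
        # 200 acknowledges the message; 500 leaves it to be retried.
        return (200, "done") if handle_update(body) else (500, "failed")
    return 404, "not found"


class WorkerHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, text: str) -> None:
        data = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._reply(*route("GET", self.path))

    def do_POST(self):
        # The request body is the queue message.
        length = int(self.headers.get("Content-Length", 0))
        self._reply(*route("POST", self.path, self.rfile.read(length)))

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 0) -> HTTPServer:
    """Create the server; port 0 asks the OS for any free port."""
    return HTTPServer(("127.0.0.1", port), WorkerHandler)
```

Calling `make_server(1235).serve_forever()` would start the sketch on the same port the local Docker command for autoupdate_worker publishes.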
Create the deployment zips using
python deploy/make_deploy.py webhook_server
and
python deploy/make_deploy.py autoupdate_worker
then upload them as Elastic Beanstalk application versions. Make sure to commit your changes first.
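Since changes should be committed before the zips are built, a quick check for uncommitted work can be run first. The sketch below is a convenience suggestion, not part of deploy/make_deploy.py, and the helper names are hypothetical. It relies on `git status --porcelain`, which prints nothing when the working tree is clean.

```python
import subprocess


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """`git status --porcelain` prints one line per changed or untracked file."""
    return any(line.strip() for line in porcelain_output.splitlines())


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether there is nothing to commit."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return not has_uncommitted_changes(result.stdout)
```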
To run one of the servers locally in a Docker container identical to the one which will be used by EBS, use these commands:
$ docker build -f deploy/webhook_server/Dockerfile -t webhook_server ./ && docker run -i --env-file test_env -p 127.0.0.1:1234:80 webhook_server
$ docker build -f deploy/autoupdate_worker/Dockerfile -t autoupdate_worker ./ && docker run -i --env-file test_env -p 0.0.0.0:1235:1235 -v /tmp:/tmp autoupdate_worker
These commands rely on a file called test_env that sets the environment variables specified above under Development.
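The file passed to `docker run --env-file` holds one `NAME=value` pair per line, and Docker takes the value literally, so quotes should not be added around it. As a sketch of that format, the hypothetical helper below renders such a file from a mapping; the values shown in the usage note are placeholders, not real settings.

```python
def render_env_file(values: dict) -> str:
    """Render NAME=value lines in the format `docker run --env-file` reads."""
    lines = []
    for name, value in values.items():
        if "\n" in str(value):
            raise ValueError(f"{name}: env-file values must be a single line")
        # Everything after the first '=' is taken literally, so no quoting.
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"
```

For example, `render_env_file({"AWS_DEFAULT_REGION": "<region>", "GITHUB_WEBHOOK_SECRET": "<secret>"})` returns two lines that can be written to test_env. Keep that file out of version control, since it contains credentials.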
Note that the Docker app will think it's not configured if the rdf.tar.bz2 file is less than 24 hours old.
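To see which side of that 24-hour threshold a local copy falls on, the file's modification time can be inspected. This is a minimal sketch, assuming the path to rdf.tar.bz2 is known (its location is not stated here) and that age is judged by modification time; `file_age_hours` is a hypothetical helper, not a function from this project.

```python
import os
import time


def file_age_hours(path: str, now: float = None) -> float:
    """Hours elapsed since the file at `path` was last modified."""
    current = time.time() if now is None else now
    return (current - os.path.getmtime(path)) / 3600.0
```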
You can use ngrok (https://ngrok.com/) to send a GitHub webhook to the auto-update server. It will put events on the configured queuing service.
To test the local autoupdate worker, send it a POST request with curl. To redo a file of repoversions, for example:
python -c "from gitenberg_autoupdate.queue import queue_from_file; queue_from_file('/Documents/gitenberg/redo.txt')"
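A POST to the local worker can also be sent from Python instead of curl. The sketch below assumes the worker container from the Docker command above is listening on port 1235 and accepts POSTs at /do_update, the HTTP path configured for the worker environment. The expected payload format is not documented here, so `payload` is left to the caller, and both function names are hypothetical.

```python
import urllib.request


def build_update_request(payload: bytes,
                         base_url: str = "http://127.0.0.1:1235",
                         path: str = "/do_update") -> urllib.request.Request:
    """Build (but do not send) a POST to the locally running worker."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=payload,
        method="POST",
    )


def send_update(payload: bytes, timeout: float = 600.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_update_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

`urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a failed update surfaces as an exception rather than a returned status code.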