2. Setup
This section contains information about hosting, setting up, and maintaining your LinguaCafe server. There are some important steps to take after installation before you can use LinguaCafe, such as installing additional languages and importing dictionaries.
Important
On macOS you might need Docker Desktop instead of plain Docker, because it lets you use Rosetta to run images that lack arm64 support, such as our Python image, which uses spaCy models that only work on amd64.
Create a folder for LinguaCafe and a storage subfolder inside it. Then download the docker-compose.yml file and place it inside your linguacafe folder. Your folder structure should look like this:
.
├── linguacafe
│ ├── storage
│ ├── docker-compose.yml
If you want to change the default MySQL database and user, you can create a .env file inside your linguacafe folder and add these lines to it before starting your servers for the first time:
DB_DATABASE="linguacafe"
DB_USERNAME="linguacafe"
DB_PASSWORD="linguacafe"
You can also use a remote MySQL server. In this case, you must create the database itself before starting the server, and add these lines to your .env file:
DB_HOST="linguacafe-database-host"
DB_PORT=3306
macOS users with Apple silicon must also create a .env file and add the following line:
PLATFORM="linux/amd64"
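The .env settings above can also be generated with a small script. This is only a sketch: the function names are hypothetical, the values are the defaults shown above, and it assumes a natively running Python on Apple silicon reports `Darwin` and `arm64`.

```python
import platform
from pathlib import Path


def build_env_lines(database="linguacafe", username="linguacafe",
                    password="linguacafe", system=None, machine=None):
    """Return the lines of a .env file for the settings described above."""
    system = system or platform.system()
    machine = machine or platform.machine()
    lines = [
        f'DB_DATABASE="{database}"',
        f'DB_USERNAME="{username}"',
        f'DB_PASSWORD="{password}"',
    ]
    # Assumption: Apple silicon Macs report Darwin/arm64 and need the amd64 line.
    if system == "Darwin" and machine == "arm64":
        lines.append('PLATFORM="linux/amd64"')
    return lines


def write_env(folder, **settings):
    """Write the .env file into the given linguacafe folder and return its path."""
    path = Path(folder) / ".env"
    path.write_text("\n".join(build_env_lines(**settings)) + "\n", encoding="utf-8")
    return path
```

Calling `write_env("linguacafe")` before the first start would create the file next to docker-compose.yml.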
docker compose up -d
Windows:
For Windows, you can download this installation script and run it instead of running any of the commands yourself. Since this is a .bat file, Windows Defender may warn you that it is potentially malware.
Your server should now be running and accessible at http://localhost:9191.
Although your server is set up and functional, please read the user manual, because there are a few additional steps before you can use LinguaCafe, such as installing languages and importing dictionaries.
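If you prefer to confirm from a script that the server answers, a minimal check could look like the sketch below. The function name is hypothetical; the URL is the default address mentioned above, and the check only tests that some HTTP server responds, not that LinguaCafe is fully initialized.

```python
import urllib.error
import urllib.request


def is_reachable(url="http://localhost:9191", timeout=5.0):
    """Return True if an HTTP server answers at url, False otherwise."""
    # Bypass any configured proxy, since the server runs on this machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("reachable" if is_reachable() else "not reachable")
```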
MySQL error while running the `docker compose up -d` command.
Some Apple silicon users have encountered error messages like these:
[+] Pulling 1/3 on
✘ mysql Error context canceled 1.0s
⠏ webserver [⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀] Pulling 1.0s
⠏ python Pulling 1.0s
no matching manifest for linux/arm64/v8 in the manifest list entries
We do not know why, but pulling the images individually fixes this error. Run these commands, then run docker compose up -d again:
docker pull --platform linux/arm64 ghcr.io/simjanos-dev/linguacafe-webserver:latest
docker pull --platform linux/amd64 ghcr.io/simjanos-dev/linguacafe-python-service:latest
Please back up LinguaCafe before updating, otherwise you can lose your data if anything goes wrong. You can read more about backups in the user manual.
If you are below v0.12, please use the migration guide provided here instead of this command.
Download the latest docker-compose.yml file, and overwrite the old one.
Run these commands to update and start your server:
docker compose pull
docker compose up -d --force-recreate
On Windows, you can run the installation script again to update to the latest version, or run the commands separately.
If you run into any problems while updating, please contact me on GitHub or Discord and I will try to help.
Note
Please only participate in the beta if you can set up LinguaCafe yourself, can create a backup of your database, and don't mind encountering more issues than in main version releases, including bugs that can corrupt your database.
First, please create a backup of your old version, and keep it until after the next main version is released. This is very important, because you cannot downgrade your database to an older version.
To use the beta version of LinguaCafe, first check the beta release's description and follow any extra steps listed there. I'll put all important information and breaking changes there. Then create a .env file in your LinguaCafe directory if you don't already have one, and add this line to the end of it: VERSION=beta. After that, run the update command for your operating system from the readme file/user manual; it will pull and start the latest beta docker image.
When a new main version is released, you can update to it from your beta version by removing the VERSION=beta line from your .env file and following the regular update steps.
I created this docker image because I've seen people using unsupported features and wanting to use the latest version as soon as possible. You can see new beta releases on the GitHub releases page.
Note
This guide assumes you named your directory linguacafe during installation. If you used a different name for your directory, simply replace linguacafe with it.
LinguaCafe stores your data in two directories:
- linguacafe/storage directory, which stores your files.
- linguacafe/database directory, which stores your database files.
Both must be saved to preserve all your LinguaCafe data.
To make a backup of your LinguaCafe instance, simply copy your whole linguacafe directory. On Linux you may need root privileges to copy the database folder, so please make sure that the copy succeeded. Also make sure that the permissions are the same after restoring your data. You can reapply them by using the chmod command from the installation guide.
To ensure that your installed language models work, you must restart your docker container after restoring a backup.
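The directory copy described above can be scripted. The following is a sketch under stated assumptions: the function name and the timestamped folder naming are hypothetical, and `shutil.copytree` preserves permission bits but not file ownership, so the permission check mentioned above still applies after restoring.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_directory(source, backup_root, now=None):
    """Copy the whole LinguaCafe directory into a timestamped folder."""
    source = Path(source)
    now = now or datetime.now()
    target = Path(backup_root) / f"{source.name}-{now:%Y-%m-%d_%H-%M-%S}"
    # Copies storage/ and database/ together; symlinks stay symlinks.
    shutil.copytree(source, target, symlinks=True)
    return target
```

For example, `backup_directory("linguacafe", "backups")` would create a folder such as `backups/linguacafe-<date>_<time>`. On Linux the script may need to run with root privileges to read the database folder.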
Note
Back up your database regularly! I highly recommend making regular backups, especially before upgrading LinguaCafe to a newer version. LinguaCafe is still in active development, and there is a high possibility of introducing a data-corrupting bug.
With the default settings LinguaCafe will create an automatic backup of your database every day at 23:59, and delete the oldest backup if you have more than 14. You can customize these values in the docker-compose.yml file, using cron syntax.
Although copying the whole database folder works, you might also want to make a raw export of your database in order to remove the dependency on a functioning MySQL docker container. This way you can have your database data in a single .sql file, e.g., linguacafe-backup.sql.
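To illustrate the retention rule, here is a minimal sketch of "keep the newest 14, delete the rest". It is not LinguaCafe's own implementation; the function name is hypothetical, and it assumes backup files carry a sortable timestamp in their name. For reference, a daily 23:59 schedule in cron syntax is `59 23 * * *`.

```python
from pathlib import Path


def prune_backups(backup_dir, keep=14, pattern="*.sql"):
    """Delete the oldest files matching pattern until at most `keep` remain."""
    # Timestamped names sort chronologically, so the oldest files come first.
    files = sorted(Path(backup_dir).glob(pattern), key=lambda p: p.name)
    excess = files[:-keep] if keep > 0 else files
    for path in excess:
        path.unlink()
    return [p.name for p in excess]
```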
Note
If you run docker ps -a, you get a list of all Docker containers, including stopped ones. Among them there should be linguacafe-webserver or a similarly named container, in which the webserver is running.
Run this command while your LinguaCafe server is running to export your database:
docker exec -ti WEBSERVER-CONTAINER php artisan app:create-backup
where WEBSERVER-CONTAINER should be replaced with the name you used during installation. If you kept the default names, then the command is simply:
docker exec -ti linguacafe-webserver php artisan app:create-backup
You can find the created backup in your linguacafe/storage/backup folder.
You can import the database back with the following command:
docker exec -i DATABASE-CONTAINER mysql -uUSERNAME -pPASSWORD DATABASE < FILENAME
where DATABASE-CONTAINER, USERNAME, PASSWORD and DATABASE should be replaced with the names you used during installation, and FILENAME with the path to your backup file. For example:
docker exec -i linguacafe-database mysql -ulinguacafe -plinguacafe linguacafe < ./storage/backup/linguacafe_2024_09_22_18_10_02.sql
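The same import can be driven from a script instead of shell redirection. This is a sketch, assuming the default container, user, password and database names from the example above; the function names are hypothetical.

```python
import subprocess


def build_import_command(container, username, password, database):
    """Return the docker exec argument list for importing a SQL dump."""
    return ["docker", "exec", "-i", container,
            "mysql", f"-u{username}", f"-p{password}", database]


def import_backup(sql_path, container="linguacafe-database",
                  username="linguacafe", password="linguacafe",
                  database="linguacafe"):
    """Stream sql_path into mysql inside the container (replaces `< FILENAME`)."""
    command = build_import_command(container, username, password, database)
    with open(sql_path, "rb") as dump:
        # check=True raises CalledProcessError if mysql exits with an error.
        subprocess.run(command, stdin=dump, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a password contains special characters.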
When a new version of LinguaCafe is released, please create a backup, and read the GitHub Release notes and the main GitHub Readme file's update section before updating. If there is an important or a breaking change in the update, it will be noted in those places.
Caution
LinguaCafe is still in active development, and it will change from month to month. Please make sure you back up your data regularly, and expect that updates may cause problems.
LinguaCafe has recently added support for multiple users, but some features are not yet supported in a multi-user setup. One of them is Anki. Highlighted words are sent to Anki through the LinguaCafe server, and this setup does not make sense for multiple users. This will be changed in a future update, so that multiple users can send their own cards to their own Anki software.
User deletion is not yet implemented.
LinguaCafe has some settings (mostly display related) that are stored locally in the browser. These settings are shared between users if they use the same device to access LinguaCafe.
These limitations will be fixed in a future update.
In LinguaCafe all the data are separated by the selected language. This means that any action you take in one language will not affect the data in other languages, so the first thing you should do in LinguaCafe is select your target language. You can change your selected language by clicking on the flag in the bottom left corner.
When you import text, LinguaCafe performs the following steps:
- Lemma generation: When you import a text into LinguaCafe, the text processor will automatically assign the dictionary form to words for supported languages. For example, it will assign the lemma "to work" to words such as "worked".
- Gender tagging: In gendered and supported languages, LinguaCafe will prepend nouns with additional information based on the words' gender.
Some languages are not packaged in the docker image. These languages can be installed on the Admin > Languages page. Installing a language can take several minutes and requires an internet connection. Installed languages are saved in the storage directory.
Languages cannot be uninstalled individually; the only option is to uninstall all installed languages at once.
LinguaCafe supports the following languages:
Flag | Language | DeepL | Lemma generation | Gender tagging | Dictionaries |
---|---|---|---|---|---|
![]() |
Chinese | ✓ | wiktionary, cc-cedict | ||
![]() |
Croatian | ✓ | dict cc | ||
![]() |
Czech | ✓ | wiktionary, dict cc | ||
![]() |
Danish | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Dutch | ✓ | ✓ | dict cc | |
![]() |
English | ✓ | ✓ | dict cc | |
![]() |
Finnish | ✓ | inaccurate | wiktionary, dict cc | |
![]() |
French | ✓ | ✓ | wiktionary, dict cc | |
![]() |
German | ✓ | ✓ | ✓ | wiktionary, dict cc |
![]() |
Greek | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Italian | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Japanese | ✓ | ✓ | jmdict, wiktionary | |
![]() |
Korean | ✓ | ✓ | wiktionary, kengdic | |
![]() |
Latin | wiktionary | |||
![]() |
Macedonian | ✓ | wiktionary | ||
![]() |
Norwegian | ✓ | ✓ | ✓ | wiktionary, dict cc |
![]() |
Polish | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Portuguese | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Romanian | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Russian | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Slovenian | ✓ | ✓ | wiktionary | |
![]() |
Spanish | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Swedish | ✓ | ✓ | dict cc | |
![]() |
Thai | wiktionary | |||
![]() |
Turkish | ✓ | ✓ | wiktionary, dict cc | |
![]() |
Ukrainian | ✓ | wiktionary | ||
![]() |
Welsh | wiktionary, eurfa |
Note
For Chinese, only Mandarin with simplified Chinese characters is supported.
- Download the dictionaries that you want to use from the provided links below.
- Go to the Admin > Dictionaries page in LinguaCafe, and click on the Add dictionary button.
- Select the Supported dictionary file from the user manual option, then upload the downloaded file.
- Check if the detected dictionary's data is correct, then click on the Import button.
After the import process is finished, your dictionary should be available whenever you select a word while reading.
Caution
Do not rename any dictionary files. For some dictionaries the filename is used to identify them.
Dictionary | Languages | Download | Comment |
---|---|---|---|
JMDict | Japanese | GitHub release | This dictionary contains kanji and radicals for the Japanese language. Some Japanese features do not work without importing this dictionary. |
CC-CEDICT | Chinese | GitHub release | |
Kengdic | Korean | GitHub release | |
Eurfa | Welsh | GitHub release | |
Wiktionary | Chinese, Czech, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Russian, Spanish, Ukrainian, Welsh | GitHub release | |
Dict.cc | Czech, Dutch, Finnish, French, German, Italian, Norwegian, Russian, Spanish, Swedish | dict.cc |
Note
To import JMDict you must download all 4 of these files: jmdict_processed.txt, kanjidic2.xml, radical-strokes.txt, radicals.txt
You can also import a custom dictionary file in the form of a .csv file.
DeepL is a machine translation service that lets you translate up to 500,000 characters per month for free and is supported by LinguaCafe. To access the DeepL API, you'll need to create an API key, add it in Admin > API > DeepL, and enable the DeepL dictionary.
After that, go to the Admin > Dictionaries page, click the Add dictionary button, and select the DeepL dictionary option. Here you can select which language you want DeepL to translate to. You can add multiple DeepL dictionaries for the same language if you want it to translate to multiple languages.
LinguaCafe supports LibreTranslate. There are multiple ways to set it up; the only requirement is that the webserver container has to be able to reach the LibreTranslate server. With LinguaCafe's default configuration, you can follow these steps to install LibreTranslate.
Create a folder, and a file inside it named docker-compose.yml.
Add this configuration to it:
version: "3"
networks:
  linguacafe_linguacafe:
    external: true
services:
  libretranslate:
    container_name: libretranslate
    image: libretranslate/libretranslate:latest
    restart: unless-stopped
    ports:
      - 5000:5000
    networks:
      - linguacafe_linguacafe
    environment:
      - LT_LOAD_ONLY=en,nb,hu
Add the languages you want to use to the config file. You can find a list of available language codes here. You can remove the following 2 lines if you want to install every language, but that can take a long time and uses a lot of disk space:
    environment:
      - LT_LOAD_ONLY=en,nb,hu
You can change the host of LibreTranslate in the admin settings if you didn't use the provided configuration.
Run this command from the location of the created LibreTranslate folder
docker compose up -d
LibreTranslate should now be working with LinguaCafe.
If you are a programmer, you can write your own API that LinguaCafe can use as a dictionary. The default host is http://host.docker.internal:1234, which points to the computer your Docker runs on.
This is an example request that will be sent to the host. In the future it will be extended with more options, like context and batch translations.
HTTP method: POST
Content-Type: json
{
"q": "hund",
"source": "norwegian",
"target": "english"
}
The API expects a JSON response with an object, that has a translatedText field.
{
"translatedText": "dog"
}
The following example server returns 'test translation' regardless of the request, but you can use it as a starting point for writing a translation server.
import json

from bottle import route, request, response, run


@route('/', method='POST')
def translation():
    # LinguaCafe sends the term and the language names in a JSON body.
    response.headers['Content-Type'] = 'application/json'
    source_language = request.json.get('source')
    target_language = request.json.get('target')
    term = request.json.get('q')
    # Placeholder: replace with a real lookup that uses the values above.
    return json.dumps({'translatedText': 'test translation'})


# Listen on all interfaces so that the webserver container can reach it.
run(host='0.0.0.0', port=1234, reloader=True, debug=True)
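To try out a server like the one above without going through LinguaCafe, a small client can send the same request shape and read the `translatedText` field. This is a sketch using only the standard library; the function names are hypothetical, and it assumes the server runs locally on port 1234 and accepts `application/json` as the content type.

```python
import json
import urllib.request


def build_payload(term, source, target):
    """Encode the request body in the shape shown in the example request."""
    return json.dumps({"q": term, "source": source, "target": target}).encode("utf-8")


def parse_translation(body):
    """Extract the translatedText field from a JSON response body."""
    return json.loads(body)["translatedText"]


def translate(term, source, target, host="http://localhost:1234/"):
    """POST a translation request to the custom API and return the result."""
    req = urllib.request.Request(
        host,
        data=build_payload(term, source, target),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as reply:
        return parse_translation(reply.read())
```

With the example server running, `translate("hund", "norwegian", "english")` should return the placeholder string that server always sends.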
If you have a list of words that you already know before you started using LinguaCafe, you can import them from a CSV file.
Note
Changes after importing cannot be reverted, thus make sure you're importing only the words you want LinguaCafe to track.
To import words, go to the Vocabulary page, select the Data dropdown menu, and inside that click on the Import button. On the import dialog you can select your CSV file and a few options:
- Skip first row. If enabled, LinguaCafe skips the first row, which could simply be the column names.
- Only update. If enabled, no new words will be added to the system. This allows you to only update fields for words that you have already encountered in LinguaCafe.
The CSV file can have these columns, in this order:
Column Name | Required | Accepted Values | Comment |
---|---|---|---|
Word | Yes | Any word without any spaces. | |
Translation | No | Can be left empty. | |
Lemma | No | Can be left empty. | |
Reading | No | Can be left empty. | |
Lemma reading | No | Can be left empty. | |
Level | No | new, ignored, learned, 1, 2, 3, 4, 5, 6, 7 | Cannot be left empty. |
At least the first column must be present in the CSV file. Any further columns can be added to it in the order shown above. If a column is not provided, those fields will not be changed in the database. However, if a column is provided and it is left empty in a row, it will be overwritten in the database with an empty value.
After the import is complete, you will see a message about the number of created, updated and rejected words.
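A file in this layout can be produced with Python's `csv` module. The sketch below is an illustration only: the function name and the sample word are hypothetical, and it assumes a comma-separated file is what the import dialog expects. It writes all six columns, so remember that empty cells will overwrite existing values.

```python
import csv
import io

COLUMNS = ["Word", "Translation", "Lemma", "Reading", "Lemma reading", "Level"]
LEVELS = {"new", "ignored", "learned", "1", "2", "3", "4", "5", "6", "7"}


def build_import_csv(rows, header=True):
    """Return CSV text with the six columns in the documented order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        # Enable "Skip first row" in the import dialog when a header is written.
        writer.writerow(COLUMNS)
    for row in rows:
        word = row["Word"]
        level = str(row["Level"])
        if not word or " " in word:
            raise ValueError(f"invalid word: {word!r}")
        if level not in LEVELS:
            raise ValueError(f"invalid level: {level!r}")
        optional = [row.get(name, "") for name in COLUMNS[1:5]]
        writer.writerow([word] + optional + [level])
    return buffer.getvalue()
```

For example, `build_import_csv([{"Word": "hund", "Translation": "dog", "Level": "learned"}])` yields a header row followed by `hund,dog,,,,learned`.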
Anki has to have the AnkiConnect add-on installed, and it has to run on the same PC that LinguaCafe's server runs on.
To set up the Anki connection, head over to Admin > API > Anki.
Note
Future versions of LinguaCafe won't have this requirement.
Open Anki -> Tools -> Add-ons -> AnkiConnect -> Config. Change the webBindAddress option to "0.0.0.0", so that it looks like this:
"webBindAddress": "0.0.0.0",
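Once AnkiConnect is configured, one way to confirm that it answers is to call its `version` action. The sketch below is not part of LinguaCafe; the function names are hypothetical, and it assumes AnkiConnect listens on its default port 8765 on the machine where Anki is open.

```python
import json
import urllib.request


def build_version_request():
    """Return the JSON body for AnkiConnect's `version` action."""
    return json.dumps({"action": "version", "version": 6}).encode("utf-8")


def parse_result(body):
    """Return the `result` field, raising if AnkiConnect reported an error."""
    reply = json.loads(body)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]


def anki_connect_version(url="http://127.0.0.1:8765"):
    """Ask a running AnkiConnect instance for its API version."""
    req = urllib.request.Request(
        url,
        data=build_version_request(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as reply:
        return parse_result(reply.read())
```

If `anki_connect_version()` raises a connection error, Anki is probably not running or the add-on is not listening on the assumed address.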
You can use the network configuration from this example to connect Jellyfin's network with LinguaCafe. There are probably multiple ways to do it; the only requirement is that linguacafe-webserver should be able to reach Jellyfin's server to make API requests.
version: '3.5'
networks:
  linguacafe_linguacafe:
    external: true
services:
  jellyfin:
    image: jellyfin/jellyfin
    container_name: jellyfin
    user: 1000:1000
    volumes:
      - /path/to/config:/config
      - /path/to/cache:/cache
      - /path/to/media:/media:ro
    restart: 'unless-stopped'
    ports:
      - 8096:8096
    networks:
      - linguacafe_linguacafe
You must name your subtitle files in a way that Jellyfin will recognize as languages. These worked for me:
Series Name - S01E01.ja.ass
Series Name - S01E01.de.ass
Movie name.es.ass
Language codes for subtitle filenames that Jellyfin recognizes:
Language | Language Code |
---|---|
Chinese | zh |
Croatian | hr |
Czech | cs |
Danish | da |
Dutch | nl |
Finnish | fi |
French | fr |
German | de |
Italian | it |
Japanese | ja |
Korean | ko |
Lithuanian | lt |
Macedonian | mk |
Norwegian | no |
Polish | pl |
Portuguese | pt |
Romanian | ro |
Russian | ru |
Slovenian | sl |
Spanish | es |
Swedish | sv |
Thai | th |
Turkish | tr |
Ukrainian | uk |
Welsh | cy |
See Jellyfin external file naming.
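The naming pattern from the examples above can be generated with a small helper. This is a sketch: the function name is hypothetical, the codes are copied from the table above, and the `.mkv` extension in the usage example is only an assumed video file type.

```python
from pathlib import Path

# Language codes from the table above.
LANGUAGE_CODES = {
    "Chinese": "zh", "Croatian": "hr", "Czech": "cs", "Danish": "da",
    "Dutch": "nl", "Finnish": "fi", "French": "fr", "German": "de",
    "Italian": "it", "Japanese": "ja", "Korean": "ko", "Lithuanian": "lt",
    "Macedonian": "mk", "Norwegian": "no", "Polish": "pl", "Portuguese": "pt",
    "Romanian": "ro", "Russian": "ru", "Slovenian": "sl", "Spanish": "es",
    "Swedish": "sv", "Thai": "th", "Turkish": "tr", "Ukrainian": "uk",
    "Welsh": "cy",
}


def subtitle_filename(video_file, language, extension="ass"):
    """Build '<video name>.<language code>.<extension>' for a subtitle file."""
    code = LANGUAGE_CODES[language]  # KeyError for languages not in the table
    return f"{Path(video_file).stem}.{code}.{extension}"
```

For example, `subtitle_filename("Series Name - S01E01.mkv", "Japanese")` gives `Series Name - S01E01.ja.ass`, matching the first example above.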
- Create an API key in Jellyfin. You can do this on the Dashboard > API Keys menu.
- Set the created API key in LinguaCafe on the Admin > API menu.
- Set the Jellyfin host in LinguaCafe on the Admin > API menu. If you used the pre-written configs, it should be the default http://jellyfin:8096.
- Save the settings.
Now you can import subtitles from Jellyfin.
Possible error codes in browser console while importing from Jellyfin:
Error: unsupported language code: spa
This means that Jellyfin recognized the language of the subtitle, but it is not supported by LinguaCafe yet. If you find one of these, please open a GitHub issue so that it can be fixed.
Error: unsupported language code: unrecognized by jellyfin: japaaaneseee
This means that Jellyfin did not recognize japaaaneseee as a language, and it can only be fixed by renaming the file following Jellyfin's naming conventions.
If you have file naming issues and renamed a file, make sure you refresh metadata in Jellyfin before reloading LinguaCafe.