At present, BitSail only supports Flink deployment on Yarn. Support for other platforms, such as native Kubernetes, will be released soon.
Below is a step-by-step guide to help you effectively deploy it on Yarn.
To support Yarn deployment, HADOOP_CLASSPATH has to be set in the system environment. There are two ways to set it:
- Set HADOOP_CLASSPATH directly.
- Set HADOOP_HOME to the Hadoop directory in the deployment environment. The BitSail scripts will then use the following command to generate HADOOP_CLASSPATH:
if [ -n "$HADOOP_HOME" ]; then
export HADOOP_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
fi
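Before submitting a job, it can help to confirm that one of the two variables is actually usable. The following is a minimal sketch of such a pre-flight check; it is not part of the BitSail scripts, and the function name check_hadoop_env is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the BitSail scripts).
# Succeeds if HADOOP_CLASSPATH is already set, or if HADOOP_HOME points at
# a directory that contains an executable bin/hadoop.
check_hadoop_env() {
  if [ -n "${HADOOP_CLASSPATH:-}" ]; then
    echo "HADOOP_CLASSPATH is set"
    return 0
  fi
  if [ -n "${HADOOP_HOME:-}" ] && [ -x "${HADOOP_HOME}/bin/hadoop" ]; then
    echo "HADOOP_HOME is set; HADOOP_CLASSPATH can be generated"
    return 0
  fi
  echo "Neither HADOOP_CLASSPATH nor a usable HADOOP_HOME is set" >&2
  return 1
}
```

Calling check_hadoop_env before bin/bitsail makes a missing Hadoop setup fail early with a readable message instead of surfacing later during submission.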
After packaging, the build output contains a file conf/bitsail.conf. This file describes the system configuration of the deployment environment, including the Flink path and other default parameters.
Here are some frequently-used options in the configuration file:
Prefix | Parameter name | Description | Example |
---|---|---|---|
sys.flink. | flink_home | The root directory of Flink. | ${BITSAIL_HOME}/embedded/flink |
sys.flink. | checkpoint_dir | The path storing the metadata file and data files of checkpoints. Reference: Flink Checkpoints | "hdfs://opensource/bitsail/flink-1.11/checkpoints/" |
sys.flink. | flink_default_properties | General Flink runtime options configured by "-D". | { classloader.resolve-order: "child-first" akka.framesize: "838860800b" rest.client.max-content-length: 838860800 rest.server.max-content-len } |
So far, BitSail only supports the yarn-per-job mode of the Yarn resource provider. Support for others, such as native Kubernetes, will be released soon.
You can use the startup script bin/bitsail to submit Flink jobs to Yarn. The command is as follows:
bash ./bin/bitsail run --engine flink --conf [job_conf_path] --execution-mode run --queue [queue_name] --deployment-mode yarn-per-job [--priority [yarn_priority] -p/--props [name=value]]
Parameter description
- Required parameters
  - queue_name: Target Yarn queue.
  - job_conf_path: Path of the job configuration file.
- Optional parameters
  - yarn_priority: Job priority on Yarn.
  - name=value: Flink properties, for example classloader.resolve-order=child-first
    - name: Property key. Any configurable Flink parameter; it is passed through unchanged to the Flink job.
    - value: Property value.
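The parameters above can be assembled by a small wrapper. The following is only a sketch: build_submit_cmd is a hypothetical helper, not part of BitSail, and it prints the command instead of running it so the result can be reviewed first. It uses only the flags shown in the command above and accepts at most one name=value property.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BitSail): prints the yarn-per-job submit
# command built from the parameters described above.
# Usage: build_submit_cmd <job_conf_path> <queue_name> [yarn_priority] [name=value]
build_submit_cmd() {
  if [ "$#" -lt 2 ]; then
    echo "usage: build_submit_cmd <job_conf_path> <queue_name> [yarn_priority] [name=value]" >&2
    return 1
  fi
  local conf="$1" queue="$2" priority="${3:-}" prop="${4:-}"

  # Required part of the command.
  local cmd="bash ./bin/bitsail run --engine flink --conf ${conf} --execution-mode run --queue ${queue} --deployment-mode yarn-per-job"

  # Optional parts are appended only when a value was given.
  if [ -n "$priority" ]; then
    cmd="${cmd} --priority ${priority}"
  fi
  if [ -n "$prop" ]; then
    cmd="${cmd} -p ${prop}"
  fi
  printf '%s\n' "$cmd"
}
```

To pass a property without a priority, give an empty string as the third argument, for example build_submit_cmd job.json default "" classloader.resolve-order=child-first.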
Submit a test job that reads from a fake source and writes to a print sink on Yarn:
bash ./bin/bitsail run --engine flink --conf ~/bitsail-archive-0.1.0-SNAPSHOT/examples/Fake_Print_Example.json --execution-mode run -p 1=1 --deployment-mode yarn-per-job --queue default
Please check the ${FLINK_HOME}/log/ folder to read the log file of the BitSail client.
Please go to Yarn WebUI to check the logs of Flink JobManager and TaskManager.
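To find the most recent client log quickly, a helper along these lines may be useful. It is a sketch only: latest_client_log is a hypothetical name, and it assumes nothing about the log file names beyond picking the most recently modified entry in the folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BitSail): prints the path of the most
# recently modified entry in the client log folder.
# Usage: latest_client_log [log_dir]   (defaults to ${FLINK_HOME}/log)
latest_client_log() {
  local log_dir="${1:-${FLINK_HOME:-}/log}"
  local newest
  # ls -t sorts by modification time, newest first.
  newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1) || true
  # Returns non-zero when the folder is empty or missing.
  [ -n "$newest" ] && printf '%s/%s\n' "$log_dir" "$newest"
}
```

For example, tail -n 100 "$(latest_client_log)" shows the end of the newest client log after a submission.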
Suppose that the BitSail install path is ${BITSAIL_HOME}.
After building BitSail, we can enter the following path and find runnable jars and example job configuration files:
cd ${BITSAIL_HOME}/bitsail-dist/target/bitsail-dist-0.1.0-SNAPSHOT-bin/bitsail-archive-0.1.0-SNAPSHOT/
Use examples/Fake_Print_Example.json as an example to start a BitSail job. In the command below, <job-manager-address> is the address of the job manager, in the form host:port, e.g. localhost:8081.
bash bin/bitsail run \
--engine flink \
--execution-mode run \
--deployment-mode local \
--conf examples/Fake_Print_Example.json \
--jm-address <job-manager-address>
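Because the job manager address must be in host:port form, a quick check before submitting can catch typos. The following is a minimal sketch; is_host_port is a hypothetical helper, not part of BitSail, and it only checks the shape of the value, not whether the job manager is reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BitSail): succeeds if the argument has a
# non-empty host and a numeric port in the range 1..65535, separated by ':'.
is_host_port() {
  local addr="$1"
  case "$addr" in
    *:*) ;;            # must contain a colon
    *) return 1 ;;
  esac
  local host="${addr%:*}" port="${addr##*:}"
  [ -n "$host" ] || return 1
  case "$port" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

A typical use is is_host_port "$JM_ADDRESS" || exit 1 ahead of the bitsail command, where JM_ADDRESS is a variable name chosen here for illustration.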
Then you can visit the Flink WebUI to see the running job. The output of the Fake_to_Print job appears in the task manager's stdout.
Use examples/Fake_Hive_Example.json as an example:
- Before running the command, remember to fill in the job configuration with the details of an available Hive instance:
  - job.writer.db_name: the Hive database to write to.
  - job.writer.table_name: the Hive table to write to.
  - job.writer.metastore_properties: add the Hive metastore address to it, like:
{ "job": { "writer": { "metastore_properties": "{\"hive.metastore.uris\":\"thrift://localhost:9083\"}" } } }
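Note that the value of metastore_properties is itself a JSON object embedded as a string, so its double quotes must be escaped. The escaping can be sketched as below; escape_json_string is a hypothetical helper, and it handles only backslashes and double quotes, not newlines or other control characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: escapes a value so it can be embedded inside a JSON
# string. Backslashes are escaped first, then double quotes.
escape_json_string() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Example: build the metastore_properties entry from plain JSON.
props='{"hive.metastore.uris":"thrift://localhost:9083"}'
printf '"metastore_properties": "%s"\n' "$(escape_json_string "$props")"
# prints: "metastore_properties": "{\"hive.metastore.uris\":\"thrift://localhost:9083\"}"
```

Writing the inner object as plain JSON and escaping it mechanically avoids hand-editing the backslashes when the metastore address changes.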
Then you can use a similar command to submit the BitSail job to the specified Flink session:
bash bin/bitsail run \
--engine flink \
--execution-mode run \
--deployment-mode local \
--conf examples/Fake_Hive_Example.json \
--jm-address <job-manager-address>