
Databus 2.0 Example

Chavdar Botev edited this page Feb 15, 2014 · 4 revisions

Build System

Databus currently uses gradle version 1.0, available at gradle.org. The following commands will come in handy (but see README.md for more detail):

$ gradle -Dopen_source=true assemble : Builds the jars and command-line packages that help start a relay and a client. It also builds an example set which can be used as reference while writing your own client.

$ gradle -Dopen_source=true test : Runs the unit tests that accompany the source code.

You can also combine targets in a single command, such as:

$ gradle -Dopen_source=true clean assemble test : Cleans (removes) any previous build artifacts, builds them again, and then runs the unit tests.

(The -Dopen_source=true part is clunky; please bear with us until we come up with a cleaner way to handle it.)

Example Relay

A sample Databus Relay implementation is available in PersonRelayServer.java. The code is packaged into a startable command-line distribution; after gradle assemble, the tarball can be found at build/databus2-example-relay-pkg/distributions/databus2-example-relay-pkg.tar.gz. This relay is configured to serve the change stream for a view named Person.

After extracting the tarball into a new directory, cd into that directory and start the relay using the following command:

$ bin/start-example-relay.sh person

If the relay started successfully, the following curl command will produce output like this:

$ curl http://localhost:11115/sources
[{"name":"com.linkedin.events.example.person.Person","id":101}]
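The response is a flat JSON array of source name/id pairs. A health check can also be done from code; the sketch below just parses a response body like the one above, pulling out the source names with a regular expression since the JDK ships no JSON parser (fine for this fixed, flat format, but a real client should use a JSON library):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the "name" fields from a Databus /sources response of the
// flat form shown above.
public class SourcesResponse {
    private static final Pattern NAME = Pattern.compile("\"name\":\"([^\"]+)\"");

    public static List<String> sourceNames(String json) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(json);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String body = "[{\"name\":\"com.linkedin.events.example.person.Person\",\"id\":101}]";
        System.out.println(sourceNames(body));
        // prints [com.linkedin.events.example.person.Person]
    }
}
```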

Example Client

A sample Databus client is available in PersonClientMain.java, and a sample consumer implementation in PersonConsumer.java. Like the relay, the code is packaged into a startable command-line distribution; after gradle assemble, the tarball can be found at build/databus2-example-client-pkg/distributions/databus2-example-client-pkg.tar.gz. This client is configured to connect to the relay started above and to subscribe to the Person table.

After extracting the tarball into the same directory as the relay's, cd into that directory and start the client using the following command:

$ bin/start-example-client.sh person

(You can ignore any warnings of the form “log4j:WARN No appenders could be found for logger …”.) If the client successfully connects to the relay started earlier, the relay stats will show that a client from localhost has connected:

$ curl http://localhost:11115/relayStats/outbound/http/clients
["localhost"]

Configuring Oracle to output change-streams to Databus (‘Databusification’)

Create a username, password, database

Starting in the directory from which you ran gradle assemble:

$ cd db/oracle/bin
$ ./createUser.sh person person DB tbs_person /mnt/u001/oracle/data/DB > /tmp/createUser.out

(You can ignore any first-time errors about “ORA-00959: tablespace ‘TBS_PERSON’ does not exist”; the script is designed to be run repeatedly.)

Create a schema for user ‘person’

$ ./createSchema.sh person/person@DB ../../../databus2-example/db/person/ > /tmp/createSchema.out

You may wish to make a note of the time at which you ran the createSchema.sh script; it will be relevant when looking at the relay log below.

Use sqlplus to view the contents of the Databus-specific tables

$ sqlplus person/person@DB

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Dec 4 11:02:45 2012

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning and Data Mining options

View the tables that are created (as part of ‘Databusification’)

SQL> desc sy$txlog;
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 TXN                                       NOT NULL NUMBER
 SCN                                       NOT NULL NUMBER
 MASK                                               NUMBER
 TS                                        NOT NULL TIMESTAMP(6)

SQL> desc sy$sources;
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 NAME                                               VARCHAR2(30)
 BITNUM                                    NOT NULL NUMBER

SQL> select * from sy$sources;
 NAME                           BITNUM
 ------------------------------ ----------
 person                         0

SQL> desc sy$person;
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 TXN                                                NUMBER
 KEY                                       NOT NULL NUMBER
 FIRST_NAME                                NOT NULL VARCHAR2(120)
 LAST_NAME                                 NOT NULL VARCHAR2(120)
 BIRTH_DATE                                         DATE
 DELETED                                   NOT NULL VARCHAR2(5)

SQL> exit
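The schemas above hint at how the capture side works: sy$sources assigns each source a BITNUM, and sy$txlog records a MASK per transaction, which suggests MASK is a bitmask over source BITNUMs recording which sources a transaction touched (person being bit 0 here). That reading is an inference from the column names, not documented Databus behavior; assuming it, the membership check is plain bit arithmetic:

```java
// Sketch of the apparent sy$txlog/sy$sources relationship: MASK as a
// bitmask over source BITNUMs, so a transaction touched a source iff
// that source's bit is set. Inferred from the schema above, not taken
// from Databus documentation.
public class TxnMask {
    public static boolean touchedSource(long mask, int bitnum) {
        return (mask & (1L << bitnum)) != 0;
    }

    public static void main(String[] args) {
        long mask = 1L; // only bit 0 set
        System.out.println(touchedSource(mask, 0)); // person (BITNUM 0): true
        System.out.println(touchedSource(mask, 1)); // some other source: false
    }
}
```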

Generate events into the table

Again starting in the directory from which you ran gradle assemble:

$ cd databus2-example/database
$ chmod 755 *.sh
$ ./loadPersons.sh person/person@DB 300 1 load > /tmp/loadPersons.out
$ ./loadPersons.sh person/person@DB 200 302 load > /tmp/loadPersons.out2
$ sqlplus person/person@DB

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Dec 4 11:05:03 2012

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning and Data Mining options

SQL> select count(*) from sy$person;

  COUNT(*)
----------
       500

SQL> exit

Verify that the Databus Relay and Client saw the events

Change into the directory into which you unpacked the two tarballs, then:

$ less logs/relay.log
$ less logs/client.log

Don’t panic! The relay log may initially have periodic SQLSyntaxErrorExceptions due to the lack of the ‘person’ table, but it should quiet down after you run the createUser and createSchema scripts above. (Hence the comment above about noting the time at which the latter was run.) After generating some events with the loadPersons script, you should see a handful of INFO lines with EventProducerThread_person in them, including two that mention src:com.linkedin.events.example.person.Person.

The client log should be around 500 lines long (if there’s an error about failing to create a checkpoint directory, you can ignore it) and should list the firstName, lastName, birthDate, and deleted fields for each event generated by loadPersons.sh. (The names and dates are randomly generated, and on top of that, the dates are simply displayed as long integers, so don’t expect particularly realistic values.)
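Those birthDate longs can be made human-readable if, as seems likely, they are milliseconds since the Unix epoch; that interpretation is an assumption on our part, not something the example documents. A small conversion sketch:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Renders a birthDate long from the client log as a calendar date,
// assuming the value is milliseconds since the Unix epoch (UTC).
public class BirthDate {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    public static String render(long epochMillis) {
        return FMT.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(render(0L));           // prints 1970-01-01
        System.out.println(render(86_400_000L));  // one day later: 1970-01-02
    }
}
```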

Shut down the example client and relay

To shut down the client and relay, run the following commands from the directory into which you unpacked the tarballs above:

$ bin/stop-example-client.sh person
$ bin/stop-example-relay.sh person

Create Avro records for the table schema

As written, the example client (PersonConsumer.java) simply uses GenericRecord and hard-coded lookup and typecasting of fields for events. In general, though, you’ll want to use specific, Avro-generated records instead. To generate the relevant code, do the following.
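The practical difference can be sketched without pulling in the Avro library by letting a Map stand in for GenericRecord: string lookup plus a cast on the generic side, a typed accessor on the specific side. The Person class here is a hypothetical stand-in for the Avro-generated class, not its real API:

```java
import java.util.Map;

// Contrasts GenericRecord-style access (string lookup plus a cast,
// checked only at runtime) with specific-record access (typed getters,
// checked at compile time). A Map stands in for GenericRecord, and
// Person is a hypothetical stand-in for the Avro-generated class.
public class RecordAccess {
    static class Person {
        private final String firstName;
        Person(String firstName) { this.firstName = firstName; }
        String getFirstName() { return firstName; }
    }

    // Generic style: the field name and the cast are unchecked until
    // runtime, which is what the hard-coded lookups in the example do.
    public static String genericFirstName(Map<String, Object> record) {
        return (String) record.get("firstName");
    }

    public static void main(String[] args) {
        String fromGeneric = genericFirstName(Map.of("firstName", "Ada"));

        // Specific style: a typo in the getter name or a wrong type
        // would fail at compile time instead.
        String fromSpecific = new Person("Ada").getFirstName();

        System.out.println(fromGeneric.equals(fromSpecific)); // prints true
    }
}
```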

After building (gradle assemble in the first step above), unpack build/databus2-cmdline-tools-pkg/distributions/databus2-cmdline-tools-pkg.tar.gz into a directory (it can be the same one used for the sample relay and client above), and cd into that directory. Then (after substituting your own temporary directory for <your_output_dir> in the following commands), do:

$ mkdir -p <your_output_dir>/schemas_registry

$ mkdir -p <your_output_dir>/databus2-example-events/databus2-example-person/src/main/java

$ bin/dbus2-avro-schema-gen.sh -namespace com.linkedin.events.example.person -recordName Person \
    -viewName "sy\$person" -avroOutDir <your_output_dir>/schemas_registry -avroOutVersion 1 \
    -javaOutDir <your_output_dir>/databus2-example-events/databus2-example-person/src/main/java \
    -userName person -password person
Processed command line arguments:
recordName=Person
avroOutVersion=1
viewName=sy$person
javaOutDir=<your_output_dir>/databus2-example-events/databus2-example-person/src/main/java
avroOutDir=<your_output_dir>/schemas_registry
userName=person
password=person
namespace=com.linkedin.events.example.person
Generating schema for sy$person
Processing column sy$person.TXN:NUMBER
Processing column sy$person.KEY:NUMBER
Processing column sy$person.FIRST_NAME:VARCHAR2
Processing column sy$person.LAST_NAME:VARCHAR2
Processing column sy$person.BIRTH_DATE:DATE
Processing column sy$person.DELETED:VARCHAR2
Generated Schema:
{
  "name" : "Person_V1",
  "doc" : "Auto-generated Avro schema for sy$person. Generated at Dec 04, 2012 05:07:05 PM PST",
  "type" : "record",
  "meta" : "dbFieldName=sy$person;",
  "namespace" : "com.linkedin.events.example.person",
  "fields" : [ { 
    "name" : "txn",
    "type" : [ "long", "null" ],
    "meta" : "dbFieldName=TXN;dbFieldPosition=0;"
  }, {
    "name" : "key",
    "type" : [ "long", "null" ],
    "meta" : "dbFieldName=KEY;dbFieldPosition=1;"
  }, {
    "name" : "firstName",
    "type" : [ "string", "null" ],
    "meta" : "dbFieldName=FIRST_NAME;dbFieldPosition=2;"
  }, {
    "name" : "lastName",
    "type" : [ "string", "null" ],
    "meta" : "dbFieldName=LAST_NAME;dbFieldPosition=3;"
  }, {
    "name" : "birthDate",
    "type" : [ "long", "null" ],
    "meta" : "dbFieldName=BIRTH_DATE;dbFieldPosition=4;"
  }, {
    "name" : "deleted",
    "type" : [ "string", "null" ],
    "meta" : "dbFieldName=DELETED;dbFieldPosition=5;"
  } ] 
}
Avro schema will be saved in the file: <your_output_dir>/schemas_registry/com.linkedin.events.example.person.Person.1.avsc
Generating Java files in the directory: <your_output_dir>/databus2-example-events/databus2-example-person/src/main/java
Done.

The generated code can then be integrated into your own application.
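Judging from the generated schema (FIRST_NAME → firstName, BIRTH_DATE → birthDate), the tool maps column names to Avro field names with a plain underscore-to-camelCase rule. That is an inference from the output above, not code taken from dbus2-avro-schema-gen itself; a sketch of the convention:

```java
// Reproduces the column-to-field naming visible in the generated
// schema: uppercase DB columns become lowerCamelCase Avro fields
// (FIRST_NAME -> firstName). Inferred from the tool's output above.
public class FieldNames {
    public static String toAvroField(String column) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : column.toLowerCase().toCharArray()) {
            if (c == '_') {
                upperNext = true;        // drop the underscore, capitalize next
            } else {
                out.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAvroField("FIRST_NAME")); // prints firstName
        System.out.println(toAvroField("BIRTH_DATE")); // prints birthDate
        System.out.println(toAvroField("KEY"));        // prints key
    }
}
```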