An Android app that analyzes a data stream from a temperature sensor. This repository contains the server programs, which run on a PC or Mac.
The client program is in the Data-Processing-Android repo.
- Added new rules to the temperature data analysis. It can now determine whether the data reflect a cold, hot, comfortable or strange day (strange means the temperature difference is very large, i.e. it can be both hot and cold in one day).
- Improved the temperature sensor simulation. It can now simulate various kinds of climate.
- Fixed multithreading bugs. The analysis result now always corresponds to the current round of temperature sensor data.
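The four categories above can be illustrated with a small sketch. This is not the repository's Spark code: the class name `TemperatureRules`, the Celsius thresholds and the spread cutoff are assumptions chosen only to make the idea concrete.

```java
import java.util.Arrays;

public class TemperatureRules {
    // Hypothetical thresholds in degrees Celsius; the project's real values may differ.
    static final double COLD_BELOW = 10.0;
    static final double HOT_ABOVE = 28.0;
    static final double STRANGE_SPREAD = 20.0;

    /** Classifies one round of readings as "cold", "hot", "comfortable" or "strange". */
    public static String classify(double[] readings) {
        if (readings == null || readings.length == 0) {
            throw new IllegalArgumentException("need at least one reading");
        }
        double min = Arrays.stream(readings).min().getAsDouble();
        double max = Arrays.stream(readings).max().getAsDouble();
        double mean = Arrays.stream(readings).average().getAsDouble();

        // A very large spread within one day takes priority over the average.
        if (max - min >= STRANGE_SPREAD) {
            return "strange";
        }
        if (mean < COLD_BELOW) {
            return "cold";
        }
        if (mean > HOT_ABOVE) {
            return "hot";
        }
        return "comfortable";
    }

    public static void main(String[] args) {
        System.out.println(classify(new double[] {2.0, 5.0, 8.0}));
        System.out.println(classify(new double[] {5.0, 18.0, 33.0}));
    }
}
```

Checking the spread before the average is one possible ordering; it ensures a day that swings from cold to hot is not reported as merely comfortable.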
Temperature data is processed on the server, and the analysis result is sent back to the client. Client and server communicate over TCP/IP using Java sockets. Two programs run simultaneously on the server: `Data-Producer` receives data coming from the Android device and sends it to the Apache Kafka server(s). Meanwhile, `Data-Processing-PC` analyzes the data in real time with Apache Spark Streaming and sends the results back to the Android device.
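The client/server exchange over a Java socket can be sketched in a single process. This is only an illustration under stated assumptions: the class `SocketRoundTrip`, the one-line text protocol and the `received ...` reply are hypothetical, and the real programs route the data through Kafka and Spark Streaming between the two steps.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketRoundTrip {
    /** Sends one line to a throwaway local server and returns the server's reply. */
    public static String roundTrip(String reading) {
        // Port 0 lets the OS pick a free port; the project uses a fixed PORT instead.
        try (ServerSocket server = new ServerSocket(0)) {
            Thread serverThread = new Thread(() -> {
                try (Socket conn = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(conn.getInputStream()));
                     PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                    // Stand-in for the Kafka + Spark analysis: acknowledge the reading.
                    out.println("received " + in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println(reading);          // client -> server: one sensor reading
                String reply = in.readLine();  // server -> client: analysis result
                serverThread.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for server", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("21.5"));
    }
}
```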
Apache Kafka distributes data across multiple servers and is fault-tolerant, and Apache Spark Streaming processes data in parallel. This architecture is therefore capable of handling large data streams.
This guide uses Mac OS X Yosemite and a Smartisan YQ603 running Android 5.1.1 as the example. The Android app should also run on emulators in Android Studio.
The server and client should preferably be on the same local network, to avoid external/internal IP issues and ensure reachability.
- Download this repository. If Eclipse is not installed on your computer, download the latest version from here.
- Open Eclipse. When prompted to select a workspace, choose the root path of this repo. Two projects should appear in Eclipse's `Project Explorer`: `Data-Processing-PC` and `Data-Producer`.
- Update the IP addresses of the computer and the Android device in the source files `Data-Processing-PC/src/com.ucla.max.DataProcessing/DataProcessing.java` and `Data-Producer/src/com.ucla.max.DataProducer/DataProducer.java`. There should be two global constants named `PC_IP` and `ANDROID_IP`. Make sure the `PORT` used for data transfer is not occupied by other programs.
- Select the project `Data-Processing-PC` in `Package Explorer`, then select `Run - Run As... - Maven Build`. This should automatically build the program by running

      mvn clean
      mvn generate-sources
      mvn install

  If not, run these manually from the `Run - Run As...` menu. Do the same for the `Data-Producer` project.
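One way to check that a port is not occupied is to try binding a server socket to it. The sketch below is a standalone helper, not part of the repository; the class name `PortCheck` and the port number `9999` are placeholders, so substitute the value of the `PORT` constant from the source files.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {
    /** Returns true if a server socket can currently be bound to the given TCP port. */
    public static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            // Binding succeeded; the probe socket is closed again on exit.
            return true;
        } catch (IOException e) {
            // Typically a BindException: another process already holds the port.
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 9999; // placeholder; use the PORT constant from the source files
        System.out.println("Port " + port + (isPortFree(port) ? " is free" : " is in use"));
    }
}
```

The result is only a snapshot: another program could still claim the port between this check and the moment the server starts.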
- Download the Apache Kafka binary release built for Scala 2.11 from here. Open a terminal and `cd` into the download directory.
- Run the following commands to set up a ZooKeeper server and a Kafka server, and to create a topic called `temperature` that stores data from the temperature sensor. For simplicity, we create a standalone Kafka server with no fault tolerance. Both start scripts keep running in the foreground, so run ZooKeeper and the Kafka server in separate terminal windows.

      tar -xzf kafka_2.11-0.9.0.1.tgz
      cd kafka_2.11-0.9.0.1
      bin/zookeeper-server-start.sh config/zookeeper.properties
      bin/kafka-server-start.sh config/server.properties
      bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic temperature

- Optionally, open two more terminal windows for a Kafka console producer and consumer, to monitor the data on the Kafka server.

      bin/kafka-console-producer.sh --broker-list localhost:9092 --topic temperature
      bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic temperature --from-beginning

  Detailed instructions on Apache Kafka can be found in this Quick Start guide.
- Open a terminal window and `cd` into the Eclipse workspace. First run the `Data-Processing-PC` program:

      cd Data-Processing-PC/
      mvn exec:java -D exec.mainClass=com.ucla.max.DataProcessing.DataTransfer > ~/Desktop/DataProcessing-output.txt

  The standard output is redirected to a text file, which makes it easier to examine the incoming data on the Kafka server. The program may generate some exception warning messages that can be ignored. If the output file shows `Build Failure`, the communication port is probably already in use, or a previous instance of this program has not exited. Restart the computer, rebuild the Maven project and try again.
- Open another terminal window in the workspace, and run the `Data-Producer` program:

      cd Data-Producer/
      mvn exec:java -D exec.mainClass=com.ucla.max.DataProducer.DataProducer

- If Android Studio is not already installed, download it from here.
- Download the Android app project from the Data-Processing-Android repo. Import this project into Android Studio.
- Open the `AndroidData - app - src - main - java - com - ucla - max - androiddata - MainActivity.java` file, and change the global variables `PC_IP` and `ANDROID_IP` to the current IP addresses of the server and client. Make sure `PORT` is not used by other applications.
- Connect your Android device to the computer and make sure USB debugging mode is enabled in `Developer Settings`. In Android Studio, click `Run - Run "app"` and select the device to run the app.
- After you click the button in the app, some simulated temperature data should appear on the left, and the analysis results for these data are shown on the right as a log.
To find the current IP address of a device, simply search "ip" on Google.
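A web search typically reports the public address of the network. When the server and client share a local network, the private addresses of the machines may be what `PC_IP` and `ANDROID_IP` need. As a hedged alternative for the computer, the sketch below lists its local IPv4 addresses using the standard `java.net.NetworkInterface` API; the class name `LocalAddresses` is hypothetical and not part of the repository.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

public class LocalAddresses {
    /** Lists the IPv4 addresses of all active, non-loopback network interfaces. */
    public static List<String> localIpv4Addresses() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
            if (interfaces == null) {
                return result; // no interfaces configured
            }
            for (NetworkInterface nif : Collections.list(interfaces)) {
                if (!nif.isUp() || nif.isLoopback()) {
                    continue; // skip inactive interfaces and 127.0.0.1
                }
                for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                    if (addr instanceof Inet4Address && !addr.isLoopbackAddress()) {
                        result.add(addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            // Interfaces could not be queried; return whatever was collected so far.
        }
        return result;
    }

    public static void main(String[] args) {
        for (String address : localIpv4Addresses()) {
            System.out.println(address);
        }
    }
}
```

If several addresses are printed, the one on the same subnet as the Android device is usually the one to use.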
Spark Programming Guide
Spark Streaming
Spark Streaming + Kafka Integration Guide
- We plan to add more rules in Apache Spark for processing temperature data, to provide more meaningful results.
- Currently the server is deployed on a local machine. We could move it to remote servers such as Amazon EC2, and add multiple Kafka servers to increase processing power and support fault tolerance.