MapR Stream sample job

Monday, April 18, 2016

These are step-by-step instructions for running a sample job, based on the MapR Stream documentation, to help users get started with MapR Stream on MapR 5.1.

Solutions:

To avoid permission issues, all of the commands below are run as the "root" user on MapR cluster nodes.

1. Create a stream

maprcli stream create -path /stream/s1

2. Create a topic "info" inside the above stream

maprcli stream topic create -path /stream/s1 -topic info
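
Steps 1 and 2 can be wrapped in a small script so the stream path and topic name are defined in one place. This is only a sketch: the helper names (full_topic_name, run, create_stream_and_topic) and the DRY_RUN switch are made up for illustration, and by default the script only prints the commands. It also shows the "<stream path>:<topic name>" form that client code uses to address a topic.

```shell
#!/bin/sh
# Sketch: steps 1 and 2 with the stream path and topic name factored out.
# DRY_RUN is a hypothetical switch: 1 (default) only prints, 0 really runs maprcli.
DRY_RUN=${DRY_RUN:-1}

# Clients address a topic as <stream path>:<topic name>.
full_topic_name() {
    printf '%s:%s\n' "$1" "$2"
}

# Print the command, then execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Create the stream first, then the topic inside it; stop on the first failure.
create_stream_and_topic() {
    run maprcli stream create -path "$1" &&
    run maprcli stream topic create -path "$1" -topic "$2"
}

create_stream_and_topic /stream/s1 info
full_topic_name /stream/s1 info
```

Setting DRY_RUN=0 in the environment before running the script executes the two maprcli commands instead of only printing them.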

3. Prepare the java code and pom.xml for producer and consumer

My source code is here: https://github.com/viadea/MapRStream

4. Compile producer and consumer

git clone git@github.com:viadea/MapRStream.git
cd MapRStream
mvn clean install

Assume the built jar file is located here:
/mapr/my2.cluster.com/github/MapRStream/target/mapr-streams-examples-1.0-SNAPSHOT.jar

5. Launch consumer

export MAPR_CLASSPATH=/mapr/my2.cluster.com/github/MapRStream/target/mapr-streams-examples-1.0-SNAPSHOT.jar
mapr openkb.stream.SampleConsumer

Note: The poll timeout is set to 10 seconds in this sample code, so please launch the producer within 10 seconds after the consumer is launched.

6. Launch producer

export MAPR_CLASSPATH=/mapr/my2.cluster.com/github/MapRStream/target/mapr-streams-examples-1.0-SNAPSHOT.jar
mapr openkb.stream.SampleProducer

Note: The producer sends 500000 messages and then exits.
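
Because the producer has to start inside the consumer's 10-second poll timeout, steps 5 and 6 can also be driven from one script instead of two terminals. The sketch below is an assumption-laden illustration, not part of the sample repository: run_sample_job and the MAPR_CMD override are hypothetical names, and the 3-second default delay is an arbitrary value below 10 seconds.

```shell
#!/bin/sh
# Sketch: start the consumer in the background, then the producer a few
# seconds later, so the producer begins inside the 10-second poll timeout.
# MAPR_CMD is a hypothetical override for dry runs; it defaults to "mapr".
run_sample_job() {
    jar=$1
    delay=${2:-3}                  # seconds to wait before starting the producer
    launcher=${MAPR_CMD:-mapr}

    export MAPR_CLASSPATH="$jar"
    "$launcher" openkb.stream.SampleConsumer &
    consumer_pid=$!

    sleep "$delay"
    "$launcher" openkb.stream.SampleProducer

    # Blocks until the consumer process exits; interrupt with Ctrl-C if it keeps polling.
    wait "$consumer_pid"
}
```

For example, "run_sample_job /mapr/my2.cluster.com/github/MapRStream/target/mapr-streams-examples-1.0-SNAPSHOT.jar" runs both programs in one terminal, with their output interleaved.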

7. Check the size of the topic in that stream

# maprcli stream topic list -path /stream/s1
topic  partitions  logicalsize  consumers  maxlag  physicalsize
info   1           11616256     0          0       5390336

As shown above, the current physical disk size of the topic in this stream is about 5 MB (physicalsize is 5390336 bytes).
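
The byte counts in that listing are easier to read in megabytes. Here is a minimal sketch that converts them with awk; topic_sizes_mb is a hypothetical helper name, and it assumes the header row shown above with columns named logicalsize and physicalsize, using 1 MB = 1000000 bytes.

```shell
#!/bin/sh
# Sketch: read "maprcli stream topic list" output on stdin and print the
# logical and physical size of each topic in MB (1 MB = 1000000 bytes).
topic_sizes_mb() {
    awk '
        # Header row: remember the position of every column by name.
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data rows: look the sizes up by column name and convert to MB.
        NF > 0 {
            printf "%s logical=%.1fMB physical=%.1fMB\n",
                $(col["topic"]),
                $(col["logicalsize"]) / 1000000,
                $(col["physicalsize"]) / 1000000
        }'
}

# Example using the output captured in step 7:
topic_sizes_mb <<'EOF'
topic  partitions  logicalsize  consumers  maxlag  physicalsize
info   1           11616256     0          0       5390336
EOF
# prints: info logical=11.6MB physical=5.4MB
```

On a cluster node the same function can be fed directly: maprcli stream topic list -path /stream/s1 | topic_sizes_mb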
