=======
- client.js: Sets up the Kafka client and the broker configuration shared by the producers and consumers.
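A shared client file like this might look roughly as follows (a minimal sketch assuming the kafkajs package; the `clientId` value and variable names are illustrative, not the actual code):

```javascript
// Shared Kafka connection settings (illustrative values).
// The broker address matches the port assumed throughout this README.
const brokerConfig = {
  clientId: 'emoji-app',       // hypothetical client id
  brokers: ['localhost:9092'], // the broker set up in the steps below
};

// With kafkajs installed (npm i kafkajs), the client would be created like:
// const { Kafka } = require('kafkajs');
// const client = new Kafka(brokerConfig);
// module.exports = client;

module.exports = { brokerConfig };
```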
- scaletest.js: For demonstration purposes, runs a script that publishes 300 happy emojis and 200 sad emojis to a Kafka topic in message batches every 500 ms; also exposes a POST endpoint where users can generate their own emoji data! (runs on port 3000)
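The generation side of that script can be sketched like this (hedged: `makeBatch` and the exact message shape are illustrative; the real script would send the batch through a Kafka producer on its 500 ms interval):

```javascript
// Build one batch of emoji messages: 300 happy and 200 sad, as in the demo.
function makeBatch(happyCount = 300, sadCount = 200) {
  const happy = Array.from({ length: happyCount }, () => ({ emoji: '😊' }));
  const sad = Array.from({ length: sadCount }, () => ({ emoji: '😢' }));
  return happy.concat(sad);
}

// In the real script, a kafkajs producer would ship this every 500 ms, e.g.:
// setInterval(() => producer.send({
//   topic: 'emoji',
//   messages: makeBatch().map((m) => ({ value: JSON.stringify(m) })),
// }), 500);

module.exports = { makeBatch };
```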
- test1.py: In the sparkstreamlogic directory; this is our main Spark Structured Streaming process. It ingests data from the Kafka topic, processes it in 2000 ms micro-batches, and writes the results to a new Kafka topic.
- publisher.js: The main publisher and the first receiver of the processed Spark data! Acts as a forwarding layer, relaying each message to another Kafka topic that the subscriber clusters read from (runs on port 3001).
- sktcluster2.js: One of the subscribing clusters. It receives data and lets registered users access the emoji stream over socket connections. Connections are routed by a dynamic-allocation check that inspects each cluster's capacity before assigning a client. More on this later in the README! (this process runs on port 3005)
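That dynamic-allocation check boils down to a capacity scan over the cluster records (a sketch with hypothetical field names; the real data lives in the clusterinfo table described below):

```javascript
// Pick the first cluster whose active connection count is below its capacity.
// Each record is assumed to look like { port, capacity, connections }.
function pickCluster(clusters) {
  return clusters.find((c) => c.connections < c.capacity) || null;
}

// Example: the cluster on 3005 is full, so allocation skips it
// and the incoming socket connection is routed to 3006.
const clusters = [
  { port: 3005, capacity: 2, connections: 2 },
  { port: 3006, capacity: 2, connections: 1 },
];
```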
We have a robust serverless Postgres instance running on NeonTech that stores the registered users along with their tier information, which is checked before granting access to the emoji stream. It also holds a clusterinfo table with a capacity field: every time there is a new connection, only a cluster with spare capacity is picked for the socket connection that enables emoji streaming. So, to get it running:
- Have Kafka and Spark installed; for extra surety, install findspark as well via pip.
- Make sure ZooKeeper is running and the Kafka broker is running on port 9092 (if it's on another port, configure the same in index.js).
- Clone this repo!
- cd to producerlogic and run `npm i`.
- You will need the .env credentials to connect to NeonTech. Star this repo and ask anirudhpk01 for the credentials!
- If you've installed Kafka through a script, cd to /usr/local/kafka and run `bin/kafka-topics.sh --bootstrap-server localhost:9092 --list` to see which topics are currently set up. You'll need to run this every time you want to check the available topics.
- Now go to index.js and change the topic name to `emoji`, run `node index.js`, change it to `emoji-counts`, run it again, then change it to `publisher-emojis` and run it once more. This creates all the topics that our processes read from and write to.
- Run `node scaletest.js`; the CLI should show a stream of messages continually being generated. If so, it's working fine.
- cd to sparkstreamlogic and run `python3 test1.py` (or `python`). Make sure to configure your versions of Kafka and Spark in this file, and deal with findspark as you wish. If it shows an offset error, Ctrl+C and run it again; it should work! (We store the offsets for the topics Structured Streaming writes to in a checkpoints folder, and these change every time you recreate a topic.) If it's working, you will see a bunch of `WARN HDFS...` messages, then a `WARN AdminClientConfig...`, and then the `WARN ProcessingTimeExecutor` messages after each stage are the emoji data actually being processed and written. We get this warning because there is occasionally a sync issue in reading the data, so mild deviations from the 2000 ms batch processing time occur.
- cd to producerlogic and run `node publisher.js`. If it shows a timestamp error, Ctrl+C and run it again, then wait... you should soon see a continuous influx of processed emoji data in the terminal. This data gets forwarded to the publisher-emojis topic.
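The forwarding step in that process amounts to re-wrapping each consumed message for the next topic (a sketch; the real file presumably pairs a Kafka consumer with a producer, and only the `publisher-emojis` topic name comes from this README):

```javascript
// Wrap a message consumed from the Spark output topic into a send payload
// for the publisher-emojis topic that the subscriber clusters read from.
function toForwardPayload(message) {
  return {
    topic: 'publisher-emojis',
    messages: [{ value: message.value }],
  };
}

// With kafkajs, this would run inside consumer.run({ eachMessage }):
// await producer.send(toForwardPayload(message));

module.exports = { toForwardPayload };
```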
- Run `node sktcluster2.js` and the other clusters to set up the cluster servers... you can create your own too! Just copy the code in sktcluster2.js into your new cluster file and change the `PORT` and `currentPort` variables to whichever available port you want to run your server on. All set!
- Now for the final part: open Postman and send a POST request to any server's `/register` endpoint! Make sure to have a name, tier, and password in the body of the request (screenshot below)... we'll work on a frontend soon!
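The body that endpoint expects can be sketched with a small validation helper (hypothetical code; the real handler would also insert the user into the NeonTech Postgres instance):

```javascript
// Validate the /register request body: name, tier, and password are required.
function validateRegistration(body) {
  const errors = [];
  if (!body.name) errors.push('name is required');
  if (!body.tier) errors.push('tier is required');
  if (!body.password) errors.push('password is required');
  return { ok: errors.length === 0, errors };
}

// In an Express handler this would gate the database insert:
// app.post('/register', (req, res) => {
//   const { ok, errors } = validateRegistration(req.body);
//   if (!ok) return res.status(400).json({ errors });
//   // ...insert the user into Postgres, then respond 201...
// });
```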
- After successful registration, just open index2.html, rejoice at the crisp frontend, type in the details you registered with, and watch as you get dynamically redirected to an available cluster to view your emoji stream LIVE as it updates!!
If you wish to integrate our LIVE emoji stream into YOUR event-driven architecture, contact us!!