News-Recommendation-using-Wed-Mining.md

title: News Recommendation using Wed Mining
date: 2018-03-27 08:02:08 -0700
type: categories
comments: true
categories: Portfolios
tags: Python, ReactJS, Node.js, MongoDB, RPC, RabbitMQ, TensorFlow, Shell

I. Introduction

An internet-based news aggregator that scrapes trending stories from popular news sources and recommends articles based on each user's preferences, with the help of machine learning.

GitHub: https://github.com/caomingkai/News_Recommendation_System

Pull it and run it with the shell scripts:

  • First, run ./launcher.sh, which:
    • runs Redis
    • runs MongoDB locally
    • starts the recommendation service (Python)
    • starts the backend service (Python)
    • starts the web server (Node.js + ReactJS)
  • Second, run ./news_pipeline_launcher.sh, which:
    • runs Redis
    • runs MongoDB locally
    • installs the Python requirements
    • starts news_topic_modeling_service (Python machine learning)
    • starts the news collecting service (data pipeline + web scraping)

II. Tech stack:

  • Front end: React, Express, Node.js, OAuth

    • Built a responsive single-page web application for users to browse news (React, Node.js, RPC, SOA, JWT)
  • Back end: Python RPC, MongoDB, Redis, RabbitMQ

    • Service-oriented: multiple backend services communicate via JSON-RPC
    • Implemented a data pipeline that monitors, scrapes, and deduplicates news
  • News recommendation system: TensorFlow, DNN, NLP

    • Designed and built an offline training pipeline for news topic modeling
    • Deployed an online classifying service for news topic modeling using the trained model
  • News topic classifying system: TF-IDF, NLP, RabbitMQ

    • Implemented a click event log processor that collects users' click logs and updates a news preference model for each user
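
A per-user news model driven by click logs can be sketched as a time-decayed preference vector over topics. This is a minimal sketch, not the project's actual implementation: the topic list, the `ALPHA` decay rate, and the function names are all assumptions made for illustration. Each click decays every topic weight and then boosts the clicked topic, so recent clicks count more than old ones.

```python
# Hypothetical topic list and decay rate; the original post does not specify them.
TOPICS = ["World", "Sports", "Technology", "Entertainment"]
ALPHA = 0.1  # weight given to the newest click (assumed value)


def new_preference(topics=TOPICS):
    """Start a user with a uniform preference over all topics."""
    return {topic: 1.0 / len(topics) for topic in topics}


def apply_click(preference, clicked_topic, alpha=ALPHA):
    """Return an updated preference after one click.

    Every topic's weight decays by (1 - alpha); the clicked topic
    additionally gains alpha, so the weights keep summing to 1.
    """
    if clicked_topic not in preference:
        raise KeyError(f"unknown topic: {clicked_topic}")
    updated = {}
    for topic, weight in preference.items():
        decayed = (1 - alpha) * weight
        updated[topic] = decayed + alpha if topic == clicked_topic else decayed
    return updated


def rank_topics(preference):
    """Order topics from most to least preferred."""
    return sorted(preference, key=preference.get, reverse=True)


if __name__ == "__main__":
    pref = new_preference()
    for topic in ["Sports", "Sports", "Technology"]:
        pref = apply_click(pref, topic)
    print(rank_topics(pref))
```

A news feed could then sort candidate articles by the weight of their predicted topic; how the real service combines this with recency is not described in the post.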

Chart 1: Login page with authentication {% asset_img loginPage.png This is an image %}

Chart 2: News feed page {% asset_img newsPage.png This is an image %}

III. System structure:

  1. Front end tier: React & Node.js
  2. Back end tier: provides an RPC API for communication among the tiers
  3. News recommendation system: time decay algorithm
  4. News topic classifying system: TensorFlow with a 2-layer CNN model for classification
  5. Data pipeline: collects news from the news sources
    • News monitor: gets news from News.API
    • News scraper: web scraper
    • News deduper: deduplicates news using TF-IDF
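
The TF-IDF deduplication step of the pipeline can be illustrated with a small, dependency-free sketch. It assumes a new article is compared against recently stored articles and dropped when its cosine similarity to any of them exceeds a threshold; the `SIMILARITY_THRESHOLD` value and all function names are hypothetical, and the project itself may rely on a library vectorizer instead.

```python
import math
import re
from collections import Counter

SIMILARITY_THRESHOLD = 0.8  # assumed cut-off; not stated in the original post


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Build one L2-normalised TF-IDF vector (as a dict) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF, so a term present in every document keeps some weight.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def is_duplicate(candidate, recent_articles, threshold=SIMILARITY_THRESHOLD):
    """True if the candidate is too similar to any recently seen article."""
    if not recent_articles:
        return False
    vectors = tfidf_vectors([candidate] + list(recent_articles))
    return any(cosine(vectors[0], other) > threshold for other in vectors[1:])
```

For example, `is_duplicate(new_text, texts_from_same_day)` would return `True` for a lightly reworded copy of a stored story and `False` for an unrelated one. The threshold trades missed duplicates against false positives and would need tuning on real data.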

Chart 3: System with Machine Learning module chart

Chart 4: System with Recommendation module chart

Chart 5: Service dependency chart
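
The RPC API that the back end tier exposes to the other tiers can be pictured with a small JSON-RPC dispatcher. This is only a sketch assuming JSON-RPC 2.0 message shapes: the method name `get_news_summaries_for_user`, its parameters, and the registry helper are hypothetical, and the real services may use an RPC library rather than hand-written dispatch.

```python
import json

# Hypothetical registry mapping RPC method names to Python callables.
METHODS = {}


def rpc_method(func):
    """Register a function so it can be called by name over RPC."""
    METHODS[func.__name__] = func
    return func


@rpc_method
def get_news_summaries_for_user(user_id, page_num):
    """Placeholder: a real backend would query the database and recommender."""
    return [{"title": f"Sample headline {page_num}", "user": user_id}]


def _error(request_id, code, message):
    """Serialise a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle_request(raw_request):
    """Dispatch one JSON-RPC 2.0 request string and return the response string."""
    try:
        request = json.loads(raw_request)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(request, dict) or not isinstance(request.get("method"), str):
        return _error(None, -32600, "Invalid Request")
    request_id = request.get("id")
    method = METHODS.get(request["method"])
    if method is None:
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", [])
    try:
        # Named params arrive as an object, positional params as an array.
        result = method(**params) if isinstance(params, dict) else method(*params)
    except TypeError:
        return _error(request_id, -32602, "Invalid params")
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})
```

A web server could forward a request such as `{"jsonrpc": "2.0", "method": "get_news_summaries_for_user", "params": ["u1", 1], "id": 1}` to `handle_request` and relay the JSON response to the browser. Keeping the tiers behind a message format like this lets each service be written in a different language.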