Sparsh009/README.md

💻 Sparsh's Data Engineering and Big Data Science Projects

Welcome to my GitHub! I'm Sparsh, a Data Engineer with over three years of experience, specializing in Azure, Big Data technologies, and data-driven software solutions. Here, you’ll find projects that showcase my skills in data engineering, ETL processes, machine learning, and video analytics.


About Me

I'm a recent MSc graduate in Big Data Science from Queen Mary University of London and currently working as a Data Science Intern at Assentian Limited. I have a background in Azure Big Data Engineering, SQL Server, and Data Vault modeling. My work focuses on data pipelines, cloud environments, and scalable solutions for data processing.


📂 Project Highlights

1. Digitally-Enabled Construction Planning and Management

  • Description: Developed a video-analytics solution for monitoring construction-site productivity.
  • Technologies: Computer Vision, YOLOv9, LSTM, Python, Azure
  • Highlights: Real-time personnel and PPE detection; deployed across several UK construction sites.

2. Data Migration and Transformation Project

  • Description: Led a team in designing databases using Data Vault principles and implemented a multi-layer architecture for data migration.
  • Technologies: SQL Server, SSIS, Azure Synapse, Snowflake
  • Highlights: Created robust ETL processes; migrated data seamlessly across different architectures.

3. NYC Rideshare Analysis

  • Description: Built a data pipeline to analyze NYC rideshare data, focusing on traffic patterns and customer demographics.
  • Technologies: Hadoop, Spark, Python, Power BI
  • Highlights: Leveraged big data to derive insights; visualized findings in Power BI.

🛠️ Technical Skills

  • Programming: Python, SQL, T-SQL
  • Big Data: Hadoop, Spark, MapReduce
  • Databases: SQL Server, Azure Synapse, Snowflake
  • ETL: SSIS, Azure Data Factory
  • Cloud Platforms: Microsoft Azure, AWS
  • Data Modeling: Data Vault, Dimensional Modeling

🎓 Certifications

  • Generative AI Fundamentals – Databricks
  • Advanced SQL Certification – HackerRank
  • Big Data Engineer Certification – Trendy Tech
  • Global Agile Certification – Infosys

🌐 Connect with Me

Pinned

  1. Activity-Recognition-for-Construction Public

    This project implements activity recognition for construction sites using deep learning models, including YOLOv9 for object detection and CNN-LSTM for activity recognition. The goal is to classify …

    Jupyter Notebook

  2. ExplainingAI-for-Construction Public

    Jupyter Notebook

  3. KitchenWizard Public

    KitchenWizard is a web application built with Flask, hosted on AWS EC2, allowing users to search for recipes using the Spoonacular API. The app enables users to search for recipes based on dietary …

    Python

  4. Cuisine-Classification-Using-ML Public

    Machine learning pipeline to classify images of dishes as either American or Italian cuisine using the MLEnd Yummy dataset. The pipeline includes data preprocessing, feature extraction, and model t…

    Jupyter Notebook

  5. Object-Detection-for-Contruction Public

    Jupyter Notebook

  6. Ethereum-Blockchain-Analysis Public

    Analyzed Ethereum transaction data using Apache Spark to identify key trends in transactions from 2015 to 2019.

    Python