My name is Max. For more than 11 years I worked in the mining and geology domain, where I developed business in the Far East and Siberian regions of Russia and implemented data management systems (more details in my article for Globus magazine, on page 152). I have since returned to fintech and now work as a big data engineer. I continuously improve my skills in data analytics (DA) and data engineering (DE).
🌱 I've completed the Big Data Analytics and Data Engineering learning paths.
In this profile you can see the key projects and tasks I have completed during my career and education. I've grouped them in the list below.
-
Small ETL Windows app that converts an Excel protocol file to a CSV file with the required structure. Stack: pandas, re, tkinter https://github.com/mmingalov/micromine-lab-protocols
-
On start, the spider receives two links to VK.com user accounts. The task is to find the shortest chain of handshakes between them, composed of mutual friends. The task was solved in two ways: a) Scrapy spider; b) recursion. Stack: scrapy, MongoDB https://github.com/mmingalov/geekbrains-methods-data-collection-from-internet/tree/master/course_work
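
To illustrate the idea of a shortest handshake chain, here is a minimal sketch that runs a breadth-first search over an in-memory friend graph. It is not the code from the repository (which uses a Scrapy spider and a recursive variant); the `shortest_chain` function, the `friends` dictionary and the user names are all hypothetical and stand in for data that would normally be collected from VK.com.

```python
from collections import deque


def shortest_chain(friends, start, goal):
    """Return the shortest list of users linking start to goal, or None.

    `friends` maps a user to an iterable of that user's friends
    (hypothetical in-memory stand-in for scraped friend lists).
    """
    if start == goal:
        return [start]
    parents = {start: None}  # visited users and who led us to them
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for friend in friends.get(user, ()):
            if friend in parents:
                continue  # already reached by a chain that is no longer
            parents[friend] = user
            if friend == goal:
                # Walk back through the parents to rebuild the chain.
                chain = [friend]
                while parents[chain[-1]] is not None:
                    chain.append(parents[chain[-1]])
                return chain[::-1]
            queue.append(friend)
    return None  # the two users are not connected


# Made-up friend graph used only for illustration.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}
print(shortest_chain(friends, "alice", "erin"))
```

Breadth-first search visits users in order of distance from the start, so the first chain that reaches the target is guaranteed to be one of the shortest.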
-
An Apache Airflow DAG finds the three locations with the largest number of residents in the Rick and Morty API and writes the results to a Greenplum DB table. Stack: Greenplum DB, Airflow https://github.com/mmingalov/kc-airflow/blob/main/dags/m-mingalov/m-mingalov_5_Rick_and_Morty.py
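
The selection step of such a DAG could look roughly like the sketch below. It covers only the ranking logic, not the Airflow operators or the Greenplum write. It assumes each location record is a dictionary with a `name` and a `residents` list, as in the API's location objects; the `top_locations` helper and the sample records are hypothetical.

```python
import heapq


def top_locations(locations, n=3):
    """Return (name, resident_count) for the n most populated locations."""
    ranked = heapq.nlargest(n, locations, key=lambda loc: len(loc["residents"]))
    return [(loc["name"], len(loc["residents"])) for loc in ranked]


# Made-up records shaped like API location objects (illustration only).
sample_locations = [
    {"name": "Location A", "residents": ["r1", "r2"]},
    {"name": "Location B", "residents": ["r1", "r2", "r3", "r4"]},
    {"name": "Location C", "residents": []},
    {"name": "Location D", "residents": ["r1", "r2", "r3"]},
    {"name": "Location E", "residents": ["r1"]},
]

for name, count in top_locations(sample_locations):
    print(name, count)
```

In a real DAG the returned tuples would be passed to a downstream task that inserts them into the target table.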
-
In this competition, the task is to predict the mean math exam result (from 0 to 100 points) for students of the tutors listed in test.csv. Metric: coefficient of determination (R²). Several solutions are provided in a Jupyter notebook. https://github.com/mmingalov/geekbrains-data-analysis-alg/tree/master/tutors-expected-math-exam-results
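
For readers unfamiliar with the metric, the coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a plain-Python illustration, not code from the notebook; the `r_squared` function name and the sample scores are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of the true values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up exam scores used only for illustration.
actual = [50.0, 60.0, 70.0]
predicted = [52.0, 60.0, 68.0]
print(round(r_squared(actual, predicted), 4))
```

A value of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean score.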
-
Model for predicting real estate (house) prices. Price is the target variable. The output predictions file includes two columns: Id and Price. Several solutions are provided in Jupyter notebooks. https://github.com/mmingalov/geekbrains-python-data-science/tree/master/course_project
I created this dashboard to track the results of my investment deals. https://public.tableau.com/app/profile/maxim.mingalov/viz/IISv3/Dashboard1?publish=yes
MapReduce task implemented in Python https://github.com/mmingalov/kc-hadoop/tree/master/homework_lesson5
Practice in creating partitioned tables and views for a taxi dataset https://github.com/mmingalov/kc-hadoop/tree/master/homework_lesson7
-
Project from my studies at the ‘Big Data Analytics’ faculty. A detailed description is available in the PowerPoint files (Russian and English versions), and the execution steps are in the 'final project executing.docx' file. https://github.com/mmingalov/geekbrains-final-project
-
This model rates clients and decides whether or not to grant them credit. https://github.com/mmingalov/kc-big-ML/tree/main/4_1_Bank_credit_scoring
-
Model for predicting the maximum loan amount based on client data. https://github.com/mmingalov/kc-big-ML/tree/main/4_2_Bank_credit_rate
-
PySpark profiler for collecting additional statistics on table columns. https://github.com/mmingalov/spark-profiler/