This repository contains information related to the AI Explorers Program Pilots Initiative in OIT, which provided funding for CMS employee-led pilot teams to learn how artificial intelligence (AI) applications could address their business needs.
Goals of the Pilot Initiative
The goals of the pilot initiative are:
- Increase knowledge of AI/ML among CMS employees and help them grow their technological skill sets.
- Test AI solutions at a small scale.
- Accelerate adoption of AI/ML techniques.
- Learn how to bring an AI/ML project from ideation to production readiness within CMS.
- Develop proofs of value and initial minimum viable products.
Several pilots have been completed by a variety of components, such as OHC, OEDA, and OIT, and pilots from other awardees are in progress now. The previous pilots were selected based on the following evaluation criteria:
Use Case Description
The aim of the OHC (Office of Human Capital) Pilot is to enhance the transparency of the hiring process and reduce recruitment effort using USA Staffing and recruitment data. The goal is to develop a user-centered prototype from a raw dataset by processing the data through machine learning models and deploying the prototype in the cloud. The process used to develop the solution should also be repeatable and applicable to future datasets.
Objective
The main objective of the OHC Pilot is to demonstrate how machine learning can be applied to the Division's large datasets, resulting in a useful tool that meets employees' needs.
What Was Completed
A proof of concept was developed, showing how machine learning can be applied to large datasets to produce a useful tool that meets employees' needs.
GitHub Repo
[NONE]
Contacts
- Eric Rowe (OHC)
- Status: Completed
Use Case Description
The OEDA (Office of Enterprise Data and Analytics) Pilot aims to analyze the pattern of care that leads to opioid-related hospitalization. Additionally, it aims to understand the timing of opioid type/dose amount/fill dates, timing of medication for opioid use disorder (MOUD), as well as potential non-time-series contributors, such as beneficiary demographics, presence of chronic conditions, and other important confounding factors.
Objective
The primary objective of the OEDA Pilot is to gain a better understanding of the factors that contribute to opioid-related hospitalizations through data analysis.
What Was Completed
The project was completed and provided insight into the factors that contribute to opioid-related hospitalizations.
GitHub Repo
[NONE]
Contacts
- James DelAguila
- Status: Completed
Use Case Description
The OIT (Office of Information Technology) Pilot aimed to develop an automated pipeline that provides a machine-readable Automated Technical Profile for CMS systems. The objective is to infer the technology fingerprint of CMS projects based on multiple data sources at different stages of their development lifecycle.
Objective
The goal of the OIT Pilot is to apply AI and data science expertise to identify opportunities in the generation of taxonomy/ontology relating to technology in use at CMS in a standard machine-readable format. Additionally, the project aimed to apply AI/natural language processing (NLP) to multiple data sources to identify matches of the technical composition of the system, and to develop a machine-readable technical profile API with relevant metadata.
What Was Completed
The OIT Pilot resulted in the development of a machine-readable technical profile API with relevant metadata, which included levels of confidence based on the quality of the match. Furthermore, a proof-of-concept of a recommendation engine that reduces burden was explored.
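For illustration only, a machine-readable technical profile with a confidence level per match might be represented and serialized as in the sketch below. The class names, the field names (`system`, `technologies`, `name`, `source`, `confidence`), the example values, and the 0.0 to 1.0 confidence scale are all assumptions made for this example; they are not taken from the pilot's API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TechnologyMatch:
    name: str          # technology inferred for the system (hypothetical field)
    source: str        # data source the match came from (hypothetical field)
    confidence: float  # assumed 0.0-1.0 score reflecting match quality


@dataclass
class TechnicalProfile:
    system: str
    technologies: List[TechnologyMatch] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the profile to a machine-readable JSON document."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def confident_matches(self, threshold: float = 0.8) -> List[str]:
        """Return names of technologies matched at or above the threshold."""
        return [t.name for t in self.technologies if t.confidence >= threshold]


# Made-up example data, not output from the pilot.
profile = TechnicalProfile(
    system="example-system",
    technologies=[
        TechnologyMatch("PostgreSQL", "architecture document", 0.92),
        TechnologyMatch("Node.js", "source repository", 0.55),
    ],
)
print(profile.to_json())
```

A consumer of such a profile could filter on the confidence value, for example with `profile.confident_matches(0.8)`, to separate strong matches from ones that need human review.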
GitHub Repo
Contacts
[CONTACTS]
Use Case Description
The CCSQ (Center for Clinical Standards and Quality) Pilot aimed to modernize the validation process in the Hospital Quality Reporting Program (HQRP) to prevent human errors in measure scores.
Objective
The goal of the CCSQ Pilot is to create an intelligent, automated measurement validation system based on AI/ML for the Hospital Quality Reporting Program (HQRP). The pilot also aimed to improve the accuracy of the validation process, and of the data published for the healthcare community, by using AI/ML techniques that detect anomalies more efficiently from historical data.
What Was Completed
The CCSQ Pilot developed three machine learning models: an XGBoost model, an Isolation Forest model, and an ensemble model that uses both. The models were applied to each measure to predict measure scores and to identify anomalies in the published HQRP scores based on historical data. RMSE (root mean square error) was used to measure model performance by comparing actual measurements with predicted measurements.
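The RMSE comparison of actual and predicted measurements can be sketched in a few lines of plain Python. The scores below are made up, and the `flag_anomalies` helper with its fixed `tolerance` is a simplified stand-in chosen for this example; it is not the pilot's XGBoost, Isolation Forest, or ensemble logic.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def flag_anomalies(actual, predicted, tolerance):
    """Return indices where |actual - predicted| exceeds the tolerance.

    Hypothetical rule for illustration: a published score that is far
    from the predicted score is marked for manual review.
    """
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > tolerance
    ]


# Made-up measure scores, not HQRP data.
actual_scores = [90.0, 85.0, 70.0, 95.0]
predicted_scores = [88.0, 85.0, 80.0, 94.0]

error = rmse(actual_scores, predicted_scores)
suspect = flag_anomalies(actual_scores, predicted_scores, tolerance=5.0)
print(f"RMSE: {error:.3f}, indices to review: {suspect}")
```

A lower RMSE means the predicted scores track the actual scores more closely, which is why it is a convenient single number for comparing models on the same measure.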
GitHub Repo
Contacts
- Mark Canfield
- Benjamin Ghahhari
Use Case Description
The OC (Office of Communication) Pilot aimed to identify patterns in the DevOps data to predict issues and improve developer productivity.
Objective
The goal of the OC Pilot is to prototype an ML model that can predict the likely location of a bug using historical data from Jira stories and the related code commits in GitHub.
What Was Completed
The OC Pilot trained and tested dozens of ML models using data from CCXP and MCT to predict the folders that contain files with bugs based on text from Jira. Additionally, the pilot provided recommendations for improving bug report quality.
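The general idea of predicting a folder from ticket text can be illustrated with a deliberately simple word-count baseline. This is not one of the models the pilot trained: the scoring rule, the function names, the ticket text, and the folder paths are all hypothetical and exist only to show the shape of the problem.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(history):
    """Count how often each word appears in tickets fixed in each folder."""
    word_counts = defaultdict(Counter)
    for ticket_text, folder in history:
        word_counts[folder].update(tokenize(ticket_text))
    return word_counts


def rank_folders(word_counts, ticket_text):
    """Rank folders by how often they have seen the new ticket's words."""
    tokens = tokenize(ticket_text)
    scores = {
        folder: sum(counts[token] for token in tokens)
        for folder, counts in word_counts.items()
    }
    # Highest score first; ties are broken alphabetically by folder name.
    return sorted(scores, key=lambda folder: (-scores[folder], folder))


# Made-up (ticket text, folder containing the fix) pairs.
history = [
    ("Login button does not respond on mobile", "src/auth"),
    ("Password reset email link is broken", "src/auth"),
    ("Search results page shows wrong plan count", "src/search"),
]
model = train(history)
ranking = rank_folders(model, "Login fails after password reset")
print(ranking)
```

A baseline like this also hints at why bug report quality matters: a ticket that shares no vocabulary with past tickets gives any text-based model very little to work with.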
GitHub Repo
Contacts
- David Kane, WETG/DWPS – COR/Technical Advisor
- Matt Raschka, WETG – Technical Advisor
Use Case Description
The CM Pilot aimed to determine the viability of using the AI/ML capabilities available in AWS services to analyze email, creating a reliable analysis tool for listening to the Voice of the Customer (VoC).
Objective
The goal of the CM Pilot is to develop a POC that can identify topics, analyze sentiment, and identify trends. The long-term vision includes:
- Integrate the POC with Education Training and Communication (ETC)
- Use in IRA
- Use in compliance
- Use in Appeals.
What Was Completed
The CM Pilot cleaned the email data that exists in the RDS databases and used it to train the ML models. The models were used to conduct the following RDS Communication analyses:
- Topic Analysis
- Sentiment Analysis
- Time Series Analysis
- Trend Analysis.
GitHub Repo
Contacts
- Elizabeth McKenna
Use Case Description
The CMCS Pilot aimed to provide tools that improve T-MSIS state technical assistance by empowering T-MSIS technical assistants with predictive data quality metrics on state-submitted data.
Objective
The goals of the CMCS Pilot are:
- To create model(s) that predict the number of errors in a state submission for the upcoming month and assess whether it is outside the expected range of values.
- To automate as much as possible in a pilot to overcome volume and variety challenges of the use case.
- To build a lightweight pilot using already available value-add data science tools (Databricks), augmenting the small pilot team.
- To pilot multiple model types, evaluating both traditional Python libraries and Databricks AutoML, and to designate the best choice if predictive performance is acceptable.
What Was Completed
The CMCS Pilot developed a set of re-runnable Databricks-based notebooks that generate and evaluate multiple forecast and regression models, as well as a simple linear regression model for comparison. Additionally, a QuickSight business intelligence dashboard was created that displays the number of data quality errors (past and predicted) for each state and file type by month.
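A simple linear regression baseline of the kind mentioned above, together with an expected-range check for the upcoming month, might look like the following sketch. The monthly counts are made up, and the band of two residual standard deviations around the forecast is an assumption chosen for this example; the pilot's Databricks notebooks and their range definition are not reproduced here.

```python
import math


def fit_line(values):
    """Least-squares fit of values against month indices 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two months of history")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expected_range(values, width=2.0):
    """Forecast next month and a band of +/- width * residual std dev."""
    slope, intercept = fit_line(values)
    residuals = [y - (intercept + slope * x) for x, y in enumerate(values)]
    spread = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    forecast = intercept + slope * len(values)
    return forecast - width * spread, forecast, forecast + width * spread


def is_outside(value, low, high):
    """True when an observed error count falls outside the expected band."""
    return value < low or value > high


# Made-up monthly error counts for one state and file type.
monthly_errors = [100, 112, 118, 131, 139]
low, forecast, high = expected_range(monthly_errors)
print(f"forecast {forecast:.1f}, expected range {low:.1f} to {high:.1f}")
```

Comparing richer forecast models against a baseline like this one shows whether the added complexity actually improves predictive performance.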
GitHub Repo
Contacts
- Hirsch Malik