This is a web-based tool for performing data quality checks on PostgreSQL databases. It allows users to connect to a database, select tables and columns, and run various data quality assessments.
- Connect to PostgreSQL databases
- Select tables and columns for analysis
- Perform the following data quality checks:
  - Null check
  - Numeric distribution analysis
  - Inaccurate data detection (improper characters)
  - Data variety assessment
- Web-based interface using Bootstrap for responsive design
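
To make the four checks listed above more concrete, here is a minimal Python sketch that runs them over an in-memory list of column values. It is not taken from `app.py`: the function names, the result dictionaries, and in particular the definition of "improper characters" (anything outside letters, digits, whitespace, and a few punctuation marks) are assumptions made for illustration.

```python
import re
import statistics
from collections import Counter


def null_check(values):
    """Return the number and share of missing (None) values."""
    total = len(values)
    nulls = sum(1 for v in values if v is None)
    return {"total": total, "nulls": nulls,
            "null_ratio": nulls / total if total else 0.0}


def numeric_distribution(values):
    """Summarize the numeric values of a column, ignoring everything else."""
    nums = [v for v in values
            if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if not nums:
        return None
    return {
        "count": len(nums),
        "min": min(nums),
        "max": max(nums),
        "mean": statistics.mean(nums),
        "median": statistics.median(nums),
    }


# Assumption: a value is "improper" if it contains any character outside
# this allow-list. The real tool may use a different rule.
ALLOWED = re.compile(r"[A-Za-z0-9\s.,@_'-]*")


def improper_values(values):
    """Return the string values that contain a disallowed character."""
    return [v for v in values
            if isinstance(v, str) and not ALLOWED.fullmatch(v)]


def variety(values):
    """Count distinct non-null values and list the most frequent ones."""
    counts = Counter(v for v in values if v is not None)
    return {"distinct": len(counts), "most_common": counts.most_common(3)}
```

For example, `null_check([1, None, 3, None])` reports 2 nulls out of 4 values, and `improper_values(["alice", "bob#1"])` flags only `"bob#1"` under the allow-list assumed here.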
Before you begin, ensure you have met the following requirements:
- Python 3.7+
- pip (Python package manager)
- PostgreSQL database for testing
- Clone this repository: `git clone https://github.com/Steven-Nanga/DataGuard`, then `cd DataGuard`
- Create a virtual environment (optional but recommended): `python -m venv venv`, then `source venv/bin/activate` (on Windows, use `venv\Scripts\activate`)
- Install the required packages: `pip install -r requirements.txt`
- Start the Flask application: `python app.py`
- Open a web browser and navigate to http://localhost:5000
- Enter your PostgreSQL database credentials and connect
- Select a table and columns to analyze
- Choose the data quality checks you want to perform
- Click "Run Checks" to see the results
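
As an illustration of what the steps above could turn into on the database side, the sketch below builds a single SQL statement that counts NULLs for the selected columns of a table. This is an assumption about one possible approach, not the query `app.py` actually runs; `quote_ident` and `build_null_check_sql` are hypothetical helper names.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_null_check_sql(table, columns):
    """Build one query that returns the row count and per-column NULL counts."""
    if not columns:
        raise ValueError("select at least one column")
    parts = ["COUNT(*) AS total_rows"]
    for col in columns:
        # COUNT(col) skips NULLs, so COUNT(*) - COUNT(col) is the NULL count.
        alias = quote_ident(col + "_nulls")
        parts.append(f"COUNT(*) - COUNT({quote_ident(col)}) AS {alias}")
    return f"SELECT {', '.join(parts)} FROM {quote_ident(table)}"


if __name__ == "__main__":
    # Hypothetical table and column names, for demonstration only.
    print(build_null_check_sql("users", ["email", "age"]))
```

Quoting the identifiers matters because table and column names come from user selection and cannot be passed as ordinary query parameters. The resulting string would then be executed through whichever PostgreSQL driver the application uses.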
- `app.py`: The main Flask application
- `templates/index.html`: The HTML template for the web interface
- `requirements.txt`: List of Python dependencies
Contributions to the Data Quality Check Tool are welcome. Please follow these steps:
- Fork the repository
- Create a new branch (`git checkout -b feature/your-feature-name`)
- Make your changes
- Commit your changes (`git commit -am 'Add some feature'`)
- Push to the branch (`git push origin feature/your-feature-name`)
- Create a new Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions or feedback, please open an issue on GitHub.