From b82a0018f08e4a6e59be3c16420029f020ec86e1 Mon Sep 17 00:00:00 2001
From: BeniDage <112965704+BeniDage@users.noreply.github.com>
Date: Tue, 24 Sep 2024 09:29:29 +1000
Subject: [PATCH] MongoDB Connection Documentation (#153)

Co-authored-by: Kaleb <82347290+SassafrasAU@users.noreply.github.com>
---
 .../Data Anonymization/dataanonymization.md   |  80 +++++++------
 .../MongoDb Connection/_category_.json        |   9 ++
 .../MongoDb Connection/mongodbconnection.md   | 108 ++++++++++++++++++
 3 files changed, 159 insertions(+), 38 deletions(-)
 create mode 100644 docs/data-warehousing/MongoDb Connection/_category_.json
 create mode 100644 docs/data-warehousing/MongoDb Connection/mongodbconnection.md

diff --git a/docs/data-warehousing/Data Anonymization/dataanonymization.md b/docs/data-warehousing/Data Anonymization/dataanonymization.md
index deb19bf0f..00f3d13ae 100644
--- a/docs/data-warehousing/Data Anonymization/dataanonymization.md	
+++ b/docs/data-warehousing/Data Anonymization/dataanonymization.md	
@@ -3,6 +3,7 @@ sidebar_position: 3
 ---
 
 # Anonymization and Masking in Healthcare Data
+
 Implementation and Rationale
 
 :::info
@@ -10,62 +11,65 @@ Implementation and Rationale
 **Document Reference:** Data Anonymization. **Expiry Date:** 29 April 2025. **Version:** 1.0.
 :::
 
-
 ## Introduction:
 
-In the realm of healthcare data management, preserving patient privacy and confidentiality is 
-of utmost importance. Anonymization and masking techniques serve as essential tools in 
-safeguarding sensitive information while allowing for meaningful analysis and research. This 
-document elucidates the implementation of anonymization and masking in a heart attack 
+In the realm of healthcare data management, preserving patient privacy and confidentiality is
+of utmost importance. Anonymization and masking techniques serve as essential tools in
+safeguarding sensitive information while allowing for meaningful analysis and research. This
+document elucidates the implementation of anonymization and masking in a heart attack
 prediction dataset and provides insights into the rationale behind their application.
 
 ## Code Implementation:
-The provided code utilizes Python libraries such as Pandas, Faker, and hashlib to anonymize 
+
+The provided code utilizes Python libraries such as Pandas, Faker, and hashlib to anonymize
 and mask sensitive columns within the heart attack prediction dataset. Let's delve into the implementation details
 
 ### Reading the Dataset:
 
-The original dataset is read into a Pandas DataFrame, facilitating data 
+The original dataset is read into a Pandas DataFrame, facilitating data
 manipulation and transformation.
 
-### Initializing Faker: 
+### Initializing Faker:
 
 An instance of the Faker library is initialized to generate fake data for non-sensitive columns.
 
 ### Anonymization and Masking:
 
- - Patient ID: Hashing using SHA-256 ensures irreversible transformation, preserving 
-anonymity while retaining uniqueness.
- - Age: Age values are generalized into ranges to conceal precise age information, enhancing 
-privacy.
- - Binary Attributes: Columns representing binary attributes such as sex, diabetes, smoking, 
-etc., are masked as 'Yes' or 'No' to obscure specific health conditions or behaviors.
- - Heart Attack Risk: Masked as 'High' or 'Low' to conceal exact risk prediction outcomes.
- - Numeric Attributes: Numeric values such as cholesterol, blood pressure, etc., are replaced 
-with random values within a specified range, preventing re-identification while preserving 
-statistical properties.
-
-### Saving the Anonymized Dataset: 
+- Patient ID: Hashing using SHA-256 ensures irreversible transformation, preserving
+  anonymity while retaining uniqueness.
+- Age: Age values are generalized into ranges to conceal precise age information, enhancing
+  privacy.
+- Binary Attributes: Columns representing binary attributes such as sex, diabetes, smoking,
+  etc., are masked as 'Yes' or 'No' to obscure specific health conditions or behaviors.
+- Heart Attack Risk: Masked as 'High' or 'Low' to conceal exact risk prediction outcomes.
+- Numeric Attributes: Numeric values such as cholesterol, blood pressure, etc., are replaced
+  with random values within a specified range, preventing re-identification while preserving
+  statistical properties.
+
+### Saving the Anonymized Dataset:
+
 The anonymized dataset is saved to a CSV file for further analysis and research purposes.
 
 ### Rationale for Anonymization and Masking:
- - Privacy Preservation: Anonymizing sensitive attributes such as patient IDs and masking 
-identifiable information mitigate the risk of unauthorized access and identity disclosure, thus 
-preserving patient privacy.
- - Regulatory Compliance: Adherence to regulations such as HIPAA and GDPR mandates the 
-protection of patient data through anonymization and masking, ensuring compliance and 
-avoiding legal ramifications.
- - Facilitating Research: Anonymized datasets enable researchers and analysts to conduct 
-studies and derive insights without compromising patient privacy, fostering collaboration and 
-innovation in healthcare research.
- - Building Trust: Demonstrating a commitment to protecting patient privacy through 
-anonymization and masking fosters trust among patients, healthcare providers, and regulatory 
-bodies, bolstering the integrity of healthcare data management practices.
+
+- Privacy Preservation: Anonymizing sensitive attributes such as patient IDs and masking
+  identifiable information mitigate the risk of unauthorized access and identity disclosure, thus
+  preserving patient privacy.
+- Regulatory Compliance: Adherence to regulations such as HIPAA and GDPR mandates the
+  protection of patient data through anonymization and masking, ensuring compliance and
+  avoiding legal ramifications.
+- Facilitating Research: Anonymized datasets enable researchers and analysts to conduct
+  studies and derive insights without compromising patient privacy, fostering collaboration and
+  innovation in healthcare research.
+- Building Trust: Demonstrating a commitment to protecting patient privacy through
+  anonymization and masking fosters trust among patients, healthcare providers, and regulatory
+  bodies, bolstering the integrity of healthcare data management practices.
 
 ## Conclusion:
-The implementation of anonymization and masking techniques in healthcare data 
-management is indispensable for preserving patient privacy, complying with regulations, 
-facilitating research, and building trust within the healthcare ecosystem. By anonymizing 
-sensitive attributes and masking identifiable information, organizations uphold ethical 
-standards while harnessing the power of data-driven insights to improve patient outcomes and 
-healthcare delivery
\ No newline at end of file
+
+The implementation of anonymization and masking techniques in healthcare data
+management is indispensable for preserving patient privacy, complying with regulations,
+facilitating research, and building trust within the healthcare ecosystem. By anonymizing
+sensitive attributes and masking identifiable information, organizations uphold ethical
+standards while harnessing the power of data-driven insights to improve patient outcomes and
+healthcare delivery.
diff --git a/docs/data-warehousing/MongoDb Connection/_category_.json b/docs/data-warehousing/MongoDb Connection/_category_.json
new file mode 100644
index 000000000..6677d54a7
--- /dev/null
+++ b/docs/data-warehousing/MongoDb Connection/_category_.json	
@@ -0,0 +1,9 @@
+{
+    "label": "MongoDB Connection",
+    "position": 3,
+    "link": {
+      "type": "generated-index",
+      "description": "Documentation for MongoDB Connection "
+    }
+  }
+  
\ No newline at end of file
diff --git a/docs/data-warehousing/MongoDb Connection/mongodbconnection.md b/docs/data-warehousing/MongoDb Connection/mongodbconnection.md
new file mode 100644
index 000000000..85a78e67a
--- /dev/null
+++ b/docs/data-warehousing/MongoDb Connection/mongodbconnection.md	
@@ -0,0 +1,108 @@
+---
+sidebar_position: 1
+---
+
+# MongoDB Connection Server
+
+:::info
+**Effective Date:** 15 August 2024. **Last Edited:** 20 September 2024. **Author:** Ben Dang (Redback Operations).
+**Document Reference:** MongoDB Connection. **Expiry Date:** 15 August 2025. **Version:** 1.0.
+:::
+
+This project is a web server application that connects to a MongoDB database. The setup uses Docker Compose to manage the services.
+
+## Prerequisites
+
+- Docker
+- Docker Compose
+
+## Setup
+
+### 1. Clone the Repository
+
+```sh
+git clone https://github.com/Redback-Operations/redback-data-warehouse.git
+
+cd "redback-data-warehouse/MongoDB Connection/Project1"
+
+```
+
+### 2. Create a `.env` File in the Project Root Directory
+
+- `MONGO_URI="mongodb://your_username:your_password@your_host:your_port/?authSource=your_authSource"`
+- `DB_NAME="your_database_name"`
+- `COLLECTION_NAME="your_collection_name"`
+
+### 3. Build the Images and Start the Services with Docker Compose
+
+```bash
+docker-compose up --build
+```
+
+### 4. View the Application
+
+- Open your browser and navigate to http://localhost:5003/
+
+## Configuring MongoDB and Monitoring Logs
+
+### Changing MongoDB Documents and Collections as needed
+
+- `config.py` contains the MongoDB connection string.
+- `document_model.py` contains the MongoDB collection name.
+
+### Checking Application Logs
+
+- All logs are written to `app.log` in the `logs` folder at the root of the project.
+
+## API Endpoints
+
+### 1. Get All Documents
+
+- **Endpoint**: `/documents`
+- **Method**: `GET`
+- **Description**: Retrieves all documents from the database.
+- **Response**:
+  - `200 OK`: Returns a JSON array of documents.
+
+### 2. Get Document by ID
+
+- **Endpoint**: `/documents/<document_id>`
+- **Method**: `GET`
+- **Description**: Retrieves a document by its ID.
+- **Parameters**:
+  - `document_id` (path): The ID of the document to retrieve.
+- **Response**:
+  - `200 OK`: Returns the document as a JSON object.
+  - `404 Not Found`: If the document is not found.
+
+### 3. Insert Document
+
+- **Endpoint**: `/documents`
+- **Method**: `POST`
+- **Description**: Inserts a new document into the database.
+- **Request Body**: JSON object representing the document to insert.
+- **Response**:
+  - `201 Created`: Returns a success message and the ID of the inserted document.
+
+### 4. Update Document
+
+- **Endpoint**: `/documents/<document_id>`
+- **Method**: `PUT`
+- **Description**: Updates an existing document by its ID.
+- **Parameters**:
+  - `document_id` (path): The ID of the document to update.
+- **Request Body**: JSON object representing the updated document data.
+- **Response**:
+  - `200 OK`: Returns a success message if the document was updated.
+  - `404 Not Found`: If the document is not found or no changes were made.
+
+### 5. Delete Document
+
+- **Endpoint**: `/documents/<document_id>`
+- **Method**: `DELETE`
+- **Description**: Deletes a document by its ID.
+- **Parameters**:
+  - `document_id` (path): The ID of the document to delete.
+- **Response**:
+  - `200 OK`: Returns a success message if the document was deleted.
+  - `404 Not Found`: If the document is not found.
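
The five endpoints documented in the added `mongodbconnection.md` could be exercised with a small client along these lines. This is a minimal sketch, not part of the patch: it assumes the server is reachable at `http://localhost:5003` (the address given in the docs), that request bodies are sent as JSON with a `Content-Type: application/json` header, and that responses are JSON. The helper names `build_request`, `send`, and `demo` are hypothetical, as are the sample document fields.

```python
import json
import urllib.request

# Base URL taken from the "View the Application" step; adjust if the port differs.
BASE_URL = "http://localhost:5003"


def build_request(method, document_id=None, payload=None, base_url=BASE_URL):
    """Build (but do not send) a request for the /documents API.

    Hypothetical helper: the paths mirror the documented endpoints, while the
    JSON Content-Type header is an assumption about what the server expects.
    """
    url = f"{base_url}/documents"
    if document_id is not None:
        url = f"{url}/{document_id}"
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send a request and return (status, parsed JSON body).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a
    documented 404 surfaces as an exception rather than a return value.
    """
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


def demo(document_id):
    """Walk through the documented operations against a running server."""
    print(send(build_request("GET")))                            # Get All Documents
    print(send(build_request("POST", payload={"name": "demo"})))  # Insert Document
    print(send(build_request("GET", document_id)))               # Get Document by ID
    print(send(build_request("PUT", document_id, {"name": "renamed"})))  # Update
    print(send(build_request("DELETE", document_id)))            # Delete Document
```

`demo` is deliberately not called here. With the services running, it would be invoked with the ID of an existing document, for example one returned by the insert endpoint.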