cQube is installed in two steps. The installation package is divided into the following two components:
- cQube base: contains the installation of all the software that cQube requires.
- cQube workflow: contains the implementation of all the business logic.
cQube base installation: The cQube base installation installs the complete cQube stack. It includes Ansible automation scripts that install Java, Python, NIFI, Angular, Chart.JS, Leaflet and PostgreSQL. All the software is installed without any data or data processing units. The cQube base installation also takes care of the S3 emission data, the S3 input bucket, the S3 output bucket and the remaining software configurations.
cQube Workflow installation: The cQube Workflow installation is the top-up layer of the cQube base installation. Once the cQube base installation is complete, the admin has to start the cQube workflow installation.
Based on the data source configuration, the following are set up:
- Database tables are created in PostgreSQL from the relevant data source SQL files.
- NIFI data processor groups are created from the relevant data source NIFI templates, and the parameters are updated based on the configurations set.
- The reports for the enabled data sources are shown in the Angular dashboard.
cQube - Configurations
Configurations are done as part of the installation process by the Ansible scripts. All sensitive properties, such as URLs and passwords, are saved in the Ansible properties file, which is encrypted by default. Configuration is an automated process; no manual interaction is required at this stage.
cQube configurations through Ansible code:
- The NIFI template is uploaded and instantiated with the variables and parameters.
- All the NIFI controllers are enabled and all the NIFI processors are started.
- Postgres static, transaction and aggregation tables are created in the newly created database.
The following configurations should be done in cQube during installation and data processing:
- Enable / disable the process groups: done at installation time.
- Keycloak two-factor authentication: verify that the Admin and Report viewer roles go through two-factor authentication at login.
- NIFI process configurations: change the query parameters for the NIFI process and verify whether the changes affect the process.
- Infrastructure configuration: check whether the selected infra values take effect.
The four configurations above are explained below.
This configuration should be performed at the installation stage of cQube. The state is specified in the config.yml file as a two-letter uppercase state code, taken from the file below.
https://github.com/project-sunbird/cQube/blob/release-1.7/ansible/installation_scripts/state_list
For states that do not require map reports, a feature has been added that disables the map reports and keeps the other existing reports.
To enable this feature, edit the config file as shown below before starting the installation or upgradation.
nano config.yml
Add 'none' as the value of the 'map_name' variable.
map_name : 'none'
After successful installation or upgradation, map reports will be disabled. Now school master data can have null/empty values in latitude and longitude columns.
This configuration should be performed at the installation / upgradation stage of cQube. The cQube user may select the process groups to include in cQube. During installation / upgradation, the user selects from the existing process groups by setting true / false in the datasource_config.yml document, which is available at the link below.
All the configurations should be given as below:
crc: true
attendance: true
infra: true
diksha: true
telemetry: true
udise: true
pat: true
composite: true
progresscard: true
teacher_attendance: true
data_replay: true
sat: true
As of now there are a total of 12 processor groups available in cQube.
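As a rough illustration of the true / false settings above, the following Python sketch parses lines of the form `name: true` and reports which of the 12 processor groups are missing or unrecognised. The helper names `parse_datasource_config` and `check_groups` are hypothetical and not part of cQube; the sketch also assumes the file contains only flat key/value lines.

```python
# Hypothetical helper, not part of cQube: it parses flat "key: value" lines
# like those shown for datasource_config.yml and checks the group names.
EXPECTED_GROUPS = [
    "crc", "attendance", "infra", "diksha", "telemetry", "udise", "pat",
    "composite", "progresscard", "teacher_attendance", "data_replay", "sat",
]


def parse_datasource_config(text):
    """Return {group: bool} from lines of the form 'name: true'."""
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key/value line: {raw!r}")
        value = value.strip().lower()
        if value not in ("true", "false"):
            raise ValueError(f"{key.strip()}: expected true or false, got {value!r}")
        config[key.strip()] = value == "true"
    return config


def check_groups(config):
    """Return (missing, unknown) group names relative to EXPECTED_GROUPS."""
    missing = [g for g in EXPECTED_GROUPS if g not in config]
    unknown = [g for g in config if g not in EXPECTED_GROUPS]
    return missing, unknown


sample = """
crc: true
attendance: true
infra: false
"""
cfg = parse_datasource_config(sample)
missing, unknown = check_groups(cfg)
print(cfg)
print("missing:", missing)
print("unknown:", unknown)
```

A check like this could be run before the installation starts, so that a mistyped group name or value is caught early.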
Keycloak provides single sign-on with Identity and Access Management services for cQube. Keycloak is configured for cQube at installation time, during execution of the install.sh command. If two-factor authentication is set up, the screen below appears after the first login for all users. Users need to set up two-factor authentication the first time from their mobile phone using Google Authenticator.
Below are the steps:
- Download the Google Authenticator app to your mobile phone.
- Scan the QR code to link the cQube dashboard with the Google Authenticator app.
- Provide the OTP to log in.
The session time should be specified in the config.yml file during installation / upgradation.
The default session expiry time is 7 days. Users can set the session timeout to a minimum of 30 minutes and a maximum of 3650 days.
Minutes are written as 'M' and days as 'D'.
Example: 7D for 7 days
30M for 30 minutes
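The format and limits above can be checked with a small Python sketch. The function name `session_timeout_minutes` is hypothetical, and the sketch assumes that only uppercase 'M' and 'D' suffixes are accepted; how cQube itself validates the value is not covered here.

```python
import re

MIN_MINUTES = 30                # minimum stated above: 30 minutes
MAX_MINUTES = 3650 * 24 * 60    # maximum stated above: 3650 days


def session_timeout_minutes(value):
    """Convert a value such as '7D' or '30M' to minutes, or raise ValueError."""
    match = re.fullmatch(r"(\d+)([MD])", value.strip())
    if match is None:
        raise ValueError(f"expected <number>M or <number>D, got {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    minutes = amount if unit == "M" else amount * 24 * 60
    if not MIN_MINUTES <= minutes <= MAX_MINUTES:
        raise ValueError(f"{value!r} is outside the 30M to 3650D range")
    return minutes


print(session_timeout_minutes("7D"))
print(session_timeout_minutes("30M"))
```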
The NIFI configuration process allows the query parameters to be changed before the installation process starts. To change the query of a NIFI processor, update the query parameter of that processor.
- The query should be reconstructed outside cQube.
- Query parameters should be defined in the configuration files of the NIFI processor group.
- Additional filters can be added in the query parameter.
- Cast, round and other functions can be used to update the query in the parameter configuration.
- The query in the NIFI process should be replaced with the new query.
- The query results should be reflected in the UI reports.
- Below are a few examples that add or remove filters and apply functions to the query.
Example:
- Adding a filter to the existing query for the static processor
By default, the static_get_invalid_names parameter in static_data_parameters.txt is:
The Original Query:
"static_get_invalid_names":'''select school_id,school_name,block_id,district_id,cluster_id from school_hierarchy_details where cluster_name is null or block_name is null or district_name is null''',
The Updated Query:
"static_get_invalid_names":'''select school_id,school_name,block_id,district_id,cluster_id from school_hierarchy_details where cluster_name is null or block_name is null or district_name is null or school_name is null''',
Output: The additional filter keeps records with a null school_name out of the cQube reports.
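The effect of the extra filter can be tried outside cQube, in line with the advice to reconstruct queries externally. The Python sketch below runs the original and updated queries against an in-memory SQLite table. SQLite is used here only as a stand-in for PostgreSQL, and the table layout and sample rows are assumptions made for illustration.

```python
import sqlite3

ORIGINAL = (
    "select school_id,school_name,block_id,district_id,cluster_id "
    "from school_hierarchy_details "
    "where cluster_name is null or block_name is null or district_name is null"
)
UPDATED = ORIGINAL + " or school_name is null"

conn = sqlite3.connect(":memory:")
# Assumed minimal layout; the real table may have more columns.
conn.execute(
    """create table school_hierarchy_details (
        school_id integer, school_name text,
        cluster_id integer, cluster_name text,
        block_id integer, block_name text,
        district_id integer, district_name text)"""
)
conn.executemany(
    "insert into school_hierarchy_details values (?,?,?,?,?,?,?,?)",
    [
        (1, "School A", 10, "Cluster X", 100, "Block P", 1000, "District M"),
        (2, "School B", 11, None, 100, "Block P", 1000, "District M"),
        (3, None, 10, "Cluster X", 100, "Block P", 1000, "District M"),
    ],
)

# Collect the school_id values flagged as invalid by each query.
original_ids = sorted(row[0] for row in conn.execute(ORIGINAL))
updated_ids = sorted(row[0] for row in conn.execute(UPDATED))
print("original query flags:", original_ids)
print("updated query flags:", updated_ids)
```

In this sample, school 3 has no name and is flagged only by the updated query.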
- Updating infra parameter based on active infrastructure attributes
By default, the infra_normalize parameter in infra_parameters.txt would be
The Original Query parameter:
"infra_normalize":'''select school_id ,
case when HaveDrinkingWater <>1 then 0 else 1 end as drinking_water,
case when NoOfToilet=0 or NoOfToilet is null then 0 else 1 end as toilet,
case when HaveCWSNToilet <>1 then 0 else 1 end as cwsn_toilet,
case when HaveElectricity <>1 then 0 else 1 end as electricity,
case when HaveCCTV <>1 then 0 else 1 end as cctv,
case when HaveLibrary <>1 then 0 else 1 end as library from flowfile'''
The above query parameter can be changed according to the data fields activated by the state.
If the state chooses to activate only three of the fields, the query needs to be updated as below.
The Updated Query parameter:
"infra_normalize":'''select school_id ,
case when NoOfToilet =0 then 0 else 1 end as toilet,
case when HaveElectricity<>1 then 0 else 1 end as electricity,
case when solarpanel_yn=TRUE then 1 else 0 end as solar_panel from flowfile'''
The query must be updated in the infrastructure configuration file, infra_parameters.txt, before the installation process is started. This maps the infrastructure input data to the cQube database tables.
cQube has the flexibility to support data from multiple states with minimal changes in UI code for variable data sources like Infrastructure, Infrastructure score weights and Semester assessment subjects.
The steps below are performed to implement the database configuration stage:
- Before cQube is installed, the infrastructure data source needs to be configured depending on the emission data fields required by the state in the infrastructure configuration file.
- Depending on the infrastructure attributes and their datatypes, a case statement is written in the infrastructure query parameter to map each input data field to the cQube infrastructure table.
- The case statement needs to be updated in the infrastructure configuration file, infra_parameters.txt, by updating the parameter infra_normalize.
- For the example table below, the infra_normalize parameter will look like:
"infra_normalize":'''select school_id ,
case when NoOfToilet =0 then 0 else 1 end as toilet,
case when HaveElectricity<>1 then 0 else 1 end as electricity,
case when solarpanel_yn=TRUE then 1 else 0 end as solar_panel from flowfile''',
- Example table:
datatype | input data | infrastructure_master.csv | case statement |
---|---|---|---|
integer | NoOfToilet | Toilet | case when NoOfToilet =0 then 0 else 1 end as toilet, |
bit | HaveElectricity | Electricity | case when HaveElectricity<>1 then 0 else 1 end as electricity, |
boolean | solarpanel_yn | Solar Panel | case when solarpanel_yn=TRUE then 1 else 0 end as solar_panel, |
Note: For the infrastructure_master.csv column, the values are converted to lowercase and spaces are converted to underscores (_). Example: Solar Panel is converted to solar_panel.
- Based on the configuration, changes are made to handle state-specific data fields.
- After successful validation, data tables are created with only the active and required data fields; the active data fields are processed and metrics are generated.
- The metrics generated are stored in JSON files, which are used for visualization.
- Change only the score and status of the infrastructure in infrastructure_master.csv; do not update any other column values.
- Make sure the scores of all active infrastructure (status=1) sum to 100.
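The mapping from the example table to the infra_normalize parameter can be sketched in Python as below. This is only an illustration: `TEMPLATES`, `master_to_column` and `build_infra_normalize` are hypothetical names, the one-template-per-datatype rule is inferred from the three example rows, and `solarpanel_yn` is taken from the example case statement as the input field name.

```python
# Rows mirroring the example table: (datatype, input field, master name).
FIELDS = [
    ("integer", "NoOfToilet", "Toilet"),
    ("bit", "HaveElectricity", "Electricity"),
    ("boolean", "solarpanel_yn", "Solar Panel"),
]

# One condition template per datatype, copied from the example case statements.
TEMPLATES = {
    "integer": "case when {field} =0 then 0 else 1 end as {name}",
    "bit": "case when {field}<>1 then 0 else 1 end as {name}",
    "boolean": "case when {field}=TRUE then 1 else 0 end as {name}",
}


def master_to_column(master_name):
    """Lowercase the master name and replace spaces with underscores."""
    return master_name.strip().lower().replace(" ", "_")


def build_infra_normalize(fields):
    """Assemble the select statement from one case statement per field."""
    cases = [
        TEMPLATES[datatype].format(field=field, name=master_to_column(master))
        for datatype, field, master in fields
    ]
    return "select school_id ,\n" + ",\n".join(cases) + " from flowfile"


query = build_infra_normalize(FIELDS)
print(query)
```

The printed text is the query body only; it still has to be placed inside the `"infra_normalize":'''...'''` parameter in infra_parameters.txt by hand.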
To select the infrastructure fields, please fill the details in infrastructure_master.csv file which is available at the link below.
The infrastructure score is calculated as in the example below.
If a school has drinking water, handwash, electricity, toilet, playground, hand pumps and library, the score is calculated as below:
$$20 \times 1 + 10 \times 1 + 10 \times 1 + 20 \times 1 + 20 \times 1 + 10 \times 1 + 10 \times 1 = 100$$
The total infrastructure score is 100. If a school does not have a particular infrastructure, it is awarded 0 for that infrastructure.
If the school does not have a playground, hand pump and library, the score is calculated as:
$$20 \times 1 + 10 \times 1 + 10 \times 1 + 20 \times 0 + 20 \times 1 + 10 \times 0 + 10 \times 0 = 60$$
The total infrastructure score is 60.
To calculate the score at district, block or cluster level, the infrastructure of all the schools available at that level is added together.
For example, if there are two schools in a cluster, the infrastructure score is calculated as below:
$$20 \times (1+1)/2 + 10 \times (1+0)/2 + 10 \times (1+1)/2 + 20 \times (0+0)/2 + 20 \times (1+1)/2 + 10 \times (0+0)/2 + 10 \times (0+0)/2 = 55$$
Infrastructure score for the cluster would be 55.
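The three worked examples can be reproduced with a short Python sketch. The weights are taken positionally from the examples above; `school_score` and `group_score` are hypothetical names, and the sketch assumes that each attribute is recorded as 1 (available) or 0 (not available).

```python
# Weights in the order used by the worked examples; they sum to 100.
WEIGHTS = [20, 10, 10, 20, 20, 10, 10]


def school_score(flags):
    """Score of one school: the sum of weight * flag over all attributes."""
    return sum(w * f for w, f in zip(WEIGHTS, flags))


def group_score(schools):
    """Score of a cluster, block or district: each weight is multiplied by
    the share of schools in the group that have the attribute."""
    n = len(schools)
    return sum(w * sum(column) / n for w, column in zip(WEIGHTS, zip(*schools)))


all_available = [1, 1, 1, 1, 1, 1, 1]
school_a = [1, 1, 1, 0, 1, 0, 0]
school_b = [1, 0, 1, 0, 1, 0, 0]

print(school_score(all_available))
print(school_score(school_a))
print(group_score([school_a, school_b]))
```

The three printed values correspond to the scores of 100, 60 and 55 in the examples.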
The metrics are stored in JSON files for infrastructure visualization. An example JSON document for the district infrastructure is given below:
{
"district":{
"id":2101,
"value":"test_district_name"
},
"block":{
"value":"test_block_name",
"id":210107
},
"infra_score":{
"value":"79.00"
},
"average":{
"value":"170",
"percent":"79.81"
},
"total_schools":{
"value":213
},
"total_schools_data_received":{
"value":"213"
},
"hand wash":{
"value":"213",
"percent":"100.00"
},
"solar_panel":{
"value":"203",
"percent":"95.31"
},
"library":{
"value":"210",
"percent":"98.59"
},
"drinking_water":{
"value":"13",
"percent":"6.10"
},
"tap_water":{
"value":"13",
"percent":"6.10"
},
"hand_pumps":{
"value":"158",
"percent":"74.18"
},
"playground":{
"value":"159",
"percent":"74.65"
},
"news_paper":{
"value":"213",
"percent":"100.00"
},
"digital_board":{
"value":"3",
"percent":"1.41"
},
"electricity":{
"value":"207",
"percent":"97.18"
},
"toilet":{
"value":"213",
"percent":"100.00"
},
"boys_toilet":{
"value":"213",
"percent":"100.00"
},
"girls_toilet":{
"value":"213",
"percent":"100.00"
}
}
Steps to be taken in case of upgradation:
- In case of upgradation, the infrastructure score and status need to be updated in cQube/development/postgres/infrastructure_master.csv as per the active fields and weights configured in the previous release.
cQube has the flexibility to support configuring multiple indices and their metrics with minimal changes in UI code for UDISE indices and metrics and UDISE score weights. There are 32 input data files, updated in this page, that need to be emitted to visualize the UDISE report.
The steps below need to be performed to implement the UDISE configuration stage:
- Before cQube is installed, UDISE data source needs to be configured depending on the emission data fields required by the state in the UDISE configuration file.
- During configuration, users can activate / deactivate indices and metrics and set their corresponding weights, but must not modify the key columns (id, description, column, type, indice_id). Only the status and score fields are updated, based on the active indices, metrics and their weights.
- In the configuration stage, fields are activated / deactivated for the state based on requirement.
- We can also create our own indices and metrics during the UDISE configuration stage.
- An exhaustive list of calculated metrics is available by default in cQube; these are calculated from the UDISE raw input tables.
- The exhaustive calculated metrics can be used for creation of a normalized metric.
- We can create any number of normalized metrics from the exhaustive list by choosing to add any of the calculated metrics.
- The normalized metric can be defined under:
  - Newly created index with new normalized metrics
  - Newly created index with few new normalized metrics and few existing normalized metrics with different name and metric_id.
  - Existing index with new normalized metrics
- After successful validations the active metrics and indices are generated.
- The metrics / indices generated are stored in JSON files, which are used for visualization.
- There are 3 types of direction: No, Forward and Backward. The direction is configured per metric. Based on the direction, normalization runs from 1 to 0 for backward metrics and from 0 to 1 for forward metrics.
- If schools do not have a particular metric, the total weights (denominator) are considered for the available schools. Here is an example in the Metric level configuration sheet.
Check the udise_config_example sheet for examples of the three possible configurations above.
To select the UDISE indices and metrics, please fill in the details in the udise_config.csv file, which is available at the link below.
Note:
- For providing weights, the scores of all active indices should sum to 100.
- Likewise, the scores of all active metrics of each active index should sum to 100.
- Change only the status and score columns in the udise_config.csv file, depending on the use case.
- When opening the udise_config.csv file in Excel, use '|' as the delimiter.
- For newly created metrics / indices, the metric_config field is set to 'created' in the udise_config.csv file to differentiate them from the existing static metrics / indices.
The normalization and calculation of indices and metrics happen as in the example below.
If a state selects the community participation and medical index indices and their metrics as below:
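The two weight rules in the note can be checked with a Python sketch like the one below. The function name `check_weights` is hypothetical. The sketch assumes a '|' delimited file with a header row naming the columns id, description, column, type, indice_id, status and score; the sample rows are taken from the community participation and medical index example on this page.

```python
import csv
import io

SAMPLE = """id|description|column|type|indice_id|status|score
3000|community participation|Community_Participation|indice||1|60
3001|% SMC members provided training|cp_smc_members_training_provided|metric|3000|1|40
3002|Total meetings held by SMC|cp_total_meetings_held_smc|metric|3000|1|40
3003|SMDC in school|cp_smdc_school|metric|3000|1|20
7000|medical index|Medical_Index|indice||1|40
7001|Medical Check-up Conducted|med_checkup_conducted|metric|7000|1|60
7002|De-worming Tablets|med_dewoming_tablets|metric|7000|1|20
7003|Iron Tablets|med_iron_tablets|metric|7000|1|20
"""


def check_weights(text):
    """Return a list of problems found in the weights of active rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    active = [row for row in reader if row["status"] == "1"]
    indices = [row for row in active if row["type"] == "indice"]
    problems = []

    # Rule 1: the scores of all active indices sum to 100.
    total = sum(float(row["score"]) for row in indices)
    if abs(total - 100) > 1e-9:
        problems.append(f"active indices sum to {total}, expected 100")

    # Rule 2: the active metrics of each active index sum to 100.
    for index in indices:
        metric_total = sum(
            float(row["score"])
            for row in active
            if row["type"] == "metric" and row["indice_id"] == index["id"]
        )
        if abs(metric_total - 100) > 1e-9:
            problems.append(
                f"metrics of index {index['id']} sum to {metric_total}, expected 100"
            )
    return problems


print(check_weights(SAMPLE))
```

An empty list means both rules hold for the active rows.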
id | description | column | type | indice_id | status | score |
---|---|---|---|---|---|---|
3000 | community participation | Community_Participation | indice | | 1 | 60 |
3001 | % SMC members provided training | cp_smc_members_training_provided | metric | 3000 | 1 | 40 |
3002 | Total meetings held by SMC | cp_total_meetings_held_smc | metric | 3000 | 1 | 40 |
3003 | SMDC in school | cp_smdc_school | metric | 3000 | 1 | 20 |
7000 | medical index | Medical_Index | indice | | 1 | 40 |
7001 | Medical Check-up Conducted | med_checkup_conducted | metric | 7000 | 1 | 60 |
7002 | De-worming Tablets | med_dewoming_tablets | metric | 7000 | 1 | 20 |
7003 | Iron Tablets | med_iron_tablets | metric | 7000 | 1 | 20 |
The total infrastructure score is calculated using the formulas below:
- Before the calculation starts, the data is normalized by grouping it into quartiles, as described in the document udise metric calculation logic.
- Community participation $$(cp) = (cpm1 \times 40)/100 + (cpm2 \times 40)/100 + (cpm3 \times 20)/100$$
- Medical index $$(med) = (medm1 \times 60)/100 + (medm2 \times 20)/100 + (medm3 \times 20)/100$$
- infra_score = $$(cp \times 60)/100 + (med \times 40)/100$$
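The weighted sums in the formulas can be sketched in Python as follows. The function name `weighted_score` is hypothetical, and the metric values are made-up normalized values between 0 and 1, not real UDISE data; the quartile normalization step itself is not shown.

```python
def weighted_score(values, weights):
    """Return sum(value * weight) / 100 after checking the weights total 100."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if sum(weights) != 100:
        raise ValueError("weights must sum to 100")
    return sum(v * w for v, w in zip(values, weights)) / 100


# Hypothetical normalized metric values for one school.
cp = weighted_score([1.0, 0.5, 0.0], [40, 40, 20])    # cpm1, cpm2, cpm3
med = weighted_score([0.5, 1.0, 1.0], [60, 20, 20])   # medm1, medm2, medm3
infra_score = weighted_score([cp, med], [60, 40])

print(round(cp, 4), round(med, 4), round(infra_score, 4))
```

The same function is applied twice: first to combine metrics into an index, then to combine the indices into the overall score.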
cQube can combine the information of all reports and show it in a single scatter plot; such a report is called a Composite report. A Composite report is a combination of multiple reports, used to correlate the information between data sources at school, cluster, block and district level. Example: the attendance and semester performance of a school can be compared in a single scatter report.
- This configuration should be performed at the installation / upgradation stage of cQube.
- The user may select the process groups to include in cQube. The metrics available in the selected processor groups are used in the composite report.
- If the composite report needs to be enabled, make sure the [ nifi_comp: true ] processor group is enabled in composite_enable.
Diksha configuration for progress-exhaust & summary-rollup dataset:
The Diksha progress-exhaust and summary-rollup datasets are integrated with cQube through the API. During installation / upgradation, the user needs to configure the Diksha production base_url, token and encryption key for the progress-exhaust and summary-rollup datasets. The cQube/development/python/cQube-raw-data-fetch-parameters.txt file must be configured with the production parameters before the upgradation or installation process.
Diksha Summary-rollup dataset:
The Diksha summary-rollup dataset contains the course and textbook usage data. Daily files are downloaded from the Diksha API configured in the above step directly to the cQube S3 emission bucket. Once the summary-rollup CSV file is downloaded to the emission bucket, the data processing takes place and the visualization reports are populated.
Diksha progress-exhaust dataset:
The Diksha progress-exhaust dataset contains the course enrollment and completion data. This dataset is downloaded based on the batch_id passed to the Diksha API. The list of batch_ids needs to be emitted in a CSV file through the emission API. All the data related to the batch_ids is downloaded to the emission bucket. Once the progress-exhaust zip file is downloaded, it is decrypted using the configured encryption password and stored in an emission bucket. The data processing then takes place and the visualization reports are populated.
Diksha columns are configured at the installation / upgradation stage. The summary-rollup dataset has two columns: content_gradelevel and collection_gradelevel. If the columns are available in the summary-rollup dataset exhaust, the "diksha_columns" parameter needs to be true in config.yml and upgradation_config.yml; if the columns are not available, the diksha_columns parameter needs to be false.
If the parameter is true, the Diksha “Usage by user profile” report is enabled on the dashboard; if the value is false in config.yml and upgradation_config.yml, the report is disabled.
Static table configuration (UDISE/State tables): During the installation UDISE/State specific tables can be configured.
The static table configuration is enabled for below tables
- district_master
- block_master
- cluster_master
- school_master
Based on the availability of state / UDISE tables, this needs to be configured in config.yml before installation.
After installation, if the configuration needs to be changed, the user can reinstall cQube.
All the data will be lost if the user reinstalls cQube: the aggregated table data and S3 output files will be wiped.
Users need to re-emit the data files to regenerate the aggregate tables and S3 output files.
The configuration cannot be changed during an upgrade. Only for release-1.8.1 has the configuration been enabled to change from state to UDISE and vice versa. For subsequent releases this option will be disabled.
SAT configuration is done through the emission files.
Two files need to be emitted to configure the Semester Assessment Test: semester_exam_mst and semester_exam_qst_mst.
The semester_exam_mst file needs to be updated with the following details: exam_id, assessment_year, standard, subject, exam_code, total_questions, total_marks and semester information.
The semester_exam_qst_mst file needs to be updated with question_id, exam_id, question_title, question_marks, indicator_id, indicator_title and indicator.
Please refer to the emission fields document for the field details of semester_exam_mst and semester_exam_qst_mst.
Users can emit these 2 files as an incremental load whenever a new exam is conducted. Once these files are emitted, the user can emit the marks scored by each student in all the schools for the newly conducted exams.
NIFI processes the data file, based on semester_exam_mst and semester_exam_qst_mst, into aggregation tables and S3 output files.