cQube data replay process
The data replay process varies by data source. The steps involved in the data replay process are listed below:
● The admin will be provided with a screen to clear the data for each of the data sources. The admin screen will contain the following selection options:
For student attendance and teacher attendance, the admin selects the year and month from the year and month drop-downs.
For CRC and Diksha summary rollup, the admin uses a calendar selection; data is deleted for the selected dates.
For the semester reports, the admin selects the required semester from the available semesters; the selected semester's data is deleted.
For the Periodic Assessment Test, the admin selects one or more exam codes from a multi-select box listing all available exam codes; all data related to the selected exam codes is deleted.
For Diksha TPD, the admin selects one or more Batch IDs from a select box listing all available Batch IDs; all data related to the selected Batch IDs is deleted.
For the UDISE and infrastructure data sources, the admin can delete all data by choosing 'Yes' in a Yes/No select box; a full refresh then takes place with the new data.
For the static data sources, the admin can likewise delete all data by choosing 'Yes' in a Yes/No select box; a full refresh then takes place with the new data.
● Submit and Reset All buttons will be provided on the admin screen to submit the request and reset the options.
● When the admin clicks the Submit button, the selections for all data sources are written to a JSON file.
● The JSON file containing the values selected by the admin will be placed in the S3 emission bucket.
● A scheduler will be created for the data replay process for all reports, and it will run on the schedule defined by the admin.
● The scheduler will trigger NiFi to fetch the file from the S3 input bucket. NiFi then performs the data deletion based on the inputs given by the admin (for all the data sources).
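The replay-request JSON produced by the admin screen can be sketched as follows. The field names and payload shape here are illustrative assumptions, not the actual cQube schema:

```python
import json

def build_replay_request(selections):
    """Serialize the admin's per-data-source selections to JSON.

    The keys and value shapes below are hypothetical examples of the
    per-data-source parameters described above (year/month, date range,
    exam codes, batch IDs, full-delete flag).
    """
    return json.dumps(selections, indent=2)

# One entry per data source the admin selected (assumed shape).
request = build_replay_request({
    "student_attendance": {"year": 2022, "months": [1, 2]},
    "diksha_summary_rollup": {"from_date": "2022-12-27", "to_date": "2022-12-31"},
    "periodic_assessment_test": {"exam_codes": ["PAT0302290720201"]},
    "diksha_tpd": {"batch_ids": ["0302290720201"]},
    "infrastructure": {"delete_all": True},
})
print(request)
```

The resulting file would then be uploaded to the S3 emission bucket for NiFi to pick up.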
Data deletion process
Once the file is emitted to the S3 bucket, the NiFi flow is invoked at the scheduled time and reads the input parameters from the JSON file. The deletion queries are then executed against the transaction tables. Once the workflow has run, the output files are updated to reflect the deleted data.
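As a sketch, the deletion step amounts to rendering one of the Postgres function calls listed in the table below from the JSON parameters. This helper is an illustrative assumption (it only covers the month/year data sources); how the rendered statement is executed, e.g. via a NiFi ExecuteSQL processor, is an implementation detail:

```python
def render_delete_call(data_source, year, months):
    """Render the del_data() call used for the month/year data sources
    (student_attendance, teacher_attendance, crc).

    Hypothetical helper: it mirrors the function-call column of the
    table below but is not part of the actual cQube codebase.
    """
    month_list = ",".join(str(m) for m in months)
    return (f"select del_data(p_data_source=>'{data_source}',"
            f"p_year=>{year},VARIADIC p_month=>array[{month_list}]);")

sql = render_delete_call("student_attendance", 2022, [1, 2])
# sql == "select del_data(p_data_source=>'student_attendance',p_year=>2022,VARIADIC p_month=>array[1,2]);"
```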
Data reprocessing (for previously deleted data) flow
Data reprocessing takes place through the normal cQube emission process.
● The latest data file will be emitted to the S3 emission bucket.
● The file will be processed through the regular NiFi data flow. All validations will be performed by NiFi, and the validated data will be inserted into the transaction tables.
● All the metrics will be re-calculated and the output files updated.
● The new metrics will be reflected in the reports.
The complete workflow is summarized below.
List of tables cleared for the data source
datasource | parameter | list of tables | function call |
--- | --- | --- | --- |
student_attendance | month,year | student_attendance_meta,student_attendance_staging_1,student_attendance_staging_2,student_attendance_trans,school_student_total_attedance | select del_data(p_data_source=>'student_attendance',p_year=>2022,VARIADIC p_month=>array[1,2]); |
teacher_attendance | month,year | teacher_attendance_meta,teacher_attendance_staging_1,teacher_attendance_staging_2,teacher_attendance_temp,teacher_attendance_trans,school_teacher_total_attendance | select del_data(p_data_source=>'teacher_attendance',p_year=>2022,VARIADIC p_month=>array[1,2]); |
crc | month,year | crc_location_trans,crc_inspection_trans,crc_visits_frequency | select del_data(p_data_source=>'crc',p_year=>2022,VARIADIC p_month=>array[1,2]); |
semester_assessment_test | exam_code/semester | semester_exam_mst,semester_exam_result_staging_2,semester_exam_school_qst_result,semester_exam_result_temp,semester_exam_school_result,semester_exam_qst_mst,semester_exam_result_staging_1,semester_exam_result_trans | select pat_del_data(p_data_source=>'semester_assessment_test',VARIADIC p_exam_code=>array['PAT0302290720201','PAT0302290720202']); |
periodic_assessment_test | exam_code | periodic_exam_mst,periodic_exam_result_staging_2,periodic_exam_school_qst_result,periodic_exam_result_temp,periodic_exam_school_result,periodic_exam_qst_mst,periodic_exam_result_staging_1,periodic_exam_result_trans | select pat_del_data(p_data_source=>'periodic_assessment_test',VARIADIC p_exam_code=>array['PAT0302290720201','PAT0302290720202']); |
diksha_tpd | batch_id | diksha_tpd_agg,diksha_tpd_trans,diksha_tpd_content_temp,diksha_tpd_staging | select diksha_tpd_del_data(p_data_source=>'diksha_tpd',VARIADIC p_batch_id =>array['0302290720201','0302290720202']); |
diksha_summary_rollup | from_date,to_date | diksha_content_staging,diksha_content_temp,diksha_content_trans,diksha_total_content | select diksha_summary_rollup_del_data('diksha_summary_rollup','2022-12-27','2022-12-31'); |
infrastructure | all | infrastructure_temp,infrastructure_trans | select all_del_data('infrastructure'); |
static | all | block_tmp,block_mst,district_tmp,district_mst,cluster_tmp,cluster_mst,school_master,school_tmp,school_hierarchy_details,school_geo_master | select all_del_data('static'); |
udise | all | udise_sch_incen_cwsn,udise_nsqf_plcmnt_c12,udise_sch_enr_reptr,udise_nsqf_basic_info,udise_sch_incentives,udise_nsqf_trng_prov,udise_sch_exmmarks_c10,udise_nsqf_class_cond,udise_school_metrics_trans,udise_sch_exmmarks_c12,udise_sch_pgi_details,udise_nsqf_enr_caste,udise_sch_enr_age,udise_sch_exmres_c10,udise_sch_profile,udise_nsqf_enr_sub_sec,udise_sch_enr_by_stream,udise_sch_exmres_c12,udise_sch_recp_exp,udise_nsqf_exmres_c10,udise_sch_enr_cwsn,udise_sch_exmres_c5,udise_sch_safety,udise_nsqf_exmres_c12,udise_sch_enr_fresh,udise_sch_exmres_c8,udise_sch_staff_posn,udise_nsqf_faculty,udise_sch_enr_medinstr,udise_sch_facility,udise_tch_profile,udise_nsqf_plcmnt_c10,udise_sch_enr_newadm | select all_del_data('udise'); |
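The table above can be summarized as a dispatch map from data source to delete function and expected parameters. This is a sketch of how the scheduler side might be organized, not code from the cQube repository:

```python
# Assumed dispatch map: data source -> (Postgres function name, parameters),
# transcribed from the table above.
DELETE_DISPATCH = {
    "student_attendance":       ("del_data", "month,year"),
    "teacher_attendance":       ("del_data", "month,year"),
    "crc":                      ("del_data", "month,year"),
    "semester_assessment_test": ("pat_del_data", "exam_code"),
    "periodic_assessment_test": ("pat_del_data", "exam_code"),
    "diksha_tpd":               ("diksha_tpd_del_data", "batch_id"),
    "diksha_summary_rollup":    ("diksha_summary_rollup_del_data", "from_date,to_date"),
    "infrastructure":           ("all_del_data", "all"),
    "static":                   ("all_del_data", "all"),
    "udise":                    ("all_del_data", "all"),
}

def function_for(data_source):
    """Return the delete-function name for a data source (raises KeyError
    for unknown sources, which keeps bad replay requests from running)."""
    return DELETE_DISPATCH[data_source][0]
```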