cQube data replay process
The data replay process depends on the data source. The steps involved in the data replay process are as follows:
● The admin will be provided with a screen with options to clear the data for each data source. The admin screen will contain the following selection options:
For Student attendance and Teacher attendance, the admin selects the year and month using the year & month drop-downs.
For CRC and Diksha summary rollup, the admin selects dates from a calendar. The data will be deleted for the selected dates.
For Semester reports, the admin selects the required semester from the available semesters. The selected semester's data will be deleted.
For Periodic Assessment Test, the admin selects one or more exam codes from a multi-select box listing all available exam codes. All data related to the selected exam codes will be deleted.
For Diksha TPD, the admin selects one or more Batch IDs from a select box listing all available Batch IDs. All data related to the selected Batch IDs will be deleted.
For the UDISE & Infrastructure data sources, the admin can delete the overall data by selecting 'Yes' or 'No' from a select box. A full refresh will happen with the new data.
For the static data sources, the admin can delete the overall data by selecting 'Yes' or 'No' from a select box. A full refresh will happen with the new data.
● Submit and Reset All buttons will be provided on the admin screen to submit the request and reset the options.
● When the admin clicks the Submit button, the selections for all data sources will be written to a JSON file as shown below.
● The JSON file containing the values selected by the admin will be placed in the S3 emission bucket.
● A scheduler will be created for the data replay process for all reports, and it will run based on the schedule defined by the admin.
● The scheduler will trigger NiFi to fetch the file from the S3 input bucket. NiFi performs the data deletion based on the inputs given by the admin (for all the data sources).
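The replay-request file might be assembled as in the sketch below. This is only an illustration of the idea: the field names, values, and overall JSON schema are assumptions, not the actual format the cQube admin screen emits.

```python
import json

# Hypothetical shape of the replay-request file built from the admin's
# selections. All keys and values here are illustrative, not the real
# cQube schema.
replay_request = {
    "student_attendance": {"year": 2022, "month": [1, 2]},
    "diksha_summary_rollup": {"from_date": "2022-12-27", "to_date": "2022-12-31"},
    "periodic_assessment_test": {"exam_code": ["PAT0302290720201"]},
    "infrastructure": {"delete_all": "Yes"},
}

# Serialise the selections; in cQube this file would then be uploaded
# to the S3 emission bucket for NiFi to pick up.
payload = json.dumps(replay_request, indent=2)
print(payload)
```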
Data deletion process
Once the file is emitted to the S3 bucket, the NiFi flow will be invoked at the scheduled time and will read the input parameters from the JSON file. The deletion queries will then be executed to remove the data from the transaction tables. Once the workflow has run, the output files will be updated to reflect the deleted data.
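The deletion step can be sketched as a dispatch from each data source's parameters to the matching SQL function call (the function names follow the table of data sources in this document; the dispatch logic itself is an illustrative sketch, not the NiFi implementation):

```python
# Sketch: map a data source and its JSON parameters onto the SQL
# deletion call. Function names come from the data-source table; the
# Python dispatch itself is illustrative.

def build_delete_call(source: str, params: dict) -> str:
    if source in ("student_attendance", "teacher_attendance", "crc"):
        # Month/year-based deletion via del_data
        months = ",".join(str(m) for m in params["month"])
        return (f"select del_data(p_data_source=>'{source}',"
                f"p_year=>{params['year']},VARIADIC p_month=>array[{months}]);")
    if source in ("periodic_assessment_test", "semester_assessment_test"):
        # Exam-code-based deletion via pat_del_data
        codes = ",".join(f"'{c}'" for c in params["exam_code"])
        return (f"select pat_del_data(p_data_source=>'{source}',"
                f"VARIADIC p_exam_code=>array[{codes}]);")
    if source == "diksha_tpd":
        # Batch-ID-based deletion
        batches = ",".join(f"'{b}'" for b in params["batch_id"])
        return (f"select diksha_tpd_del_data(p_data_source=>'{source}',"
                f"VARIADIC p_batch_id=>array[{batches}]);")
    # UDISE, infrastructure and static data are wiped entirely
    return f"select all_del_data('{source}');"

print(build_delete_call("student_attendance", {"year": 2022, "month": [1, 2]}))
```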
Data reprocessing (for previously deleted data) flow
Data reprocessing takes place through the normal cQube emission process.
● The latest data file will be emitted to the S3 emission bucket.
● The file will be processed by NiFi as a regular data load. All validations will be performed by NiFi and the validated data will be inserted into the transaction tables.
● All the metrics will be re-calculated and the output files will be updated.
● The new metrics will be reflected in the reports.
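The validate-then-insert step above can be sketched as follows. The record fields and validation rules here are made up for illustration; the real validations run inside the NiFi flow:

```python
# Minimal sketch of validate-then-insert: only records that pass all
# checks would reach the transaction tables. Fields and rules are
# illustrative, not the actual cQube validations.

REQUIRED_FIELDS = {"school_id", "date", "students_present", "students_total"}

def validate(record: dict) -> bool:
    """Accept a record only if it has all fields and sane attendance counts."""
    if not REQUIRED_FIELDS <= record.keys():
        return False
    return 0 <= record["students_present"] <= record["students_total"]

def process(records):
    """Split incoming records into rows to insert and rows to reject."""
    valid = [r for r in records if validate(r)]
    rejected = [r for r in records if not validate(r)]
    return valid, rejected

valid, rejected = process([
    {"school_id": "S1", "date": "2022-01-05", "students_present": 40, "students_total": 45},
    {"school_id": "S2", "date": "2022-01-05", "students_present": 50, "students_total": 45},
])
```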
The complete workflow is illustrated below.
List of tables cleared for each data source

student_attendance
Parameters: month, year
Tables: student_attendance_meta, student_attendance_staging_1, student_attendance_staging_2, student_attendance_trans, school_student_total_attedance
Function call: select del_data(p_data_source=>'student_attendance',p_year=>2022,VARIADIC p_month=>array[1,2]);

teacher_attendance
Parameters: month, year
Tables: teacher_attendance_meta, teacher_attendance_staging_1, teacher_attendance_staging_2, teacher_attendance_temp, teacher_attendance_trans, school_teacher_total_attendance
Function call: select del_data(p_data_source=>'teacher_attendance',p_year=>2022,VARIADIC p_month=>array[1,2]);

crc
Parameters: month, year
Tables: crc_location_trans, crc_inspection_trans, crc_visits_frequency
Function call: select del_data(p_data_source=>'crc',p_year=>2022,VARIADIC p_month=>array[1,2]);

semester_assessment_test
Parameters: exam_code/semester
Tables: semester_exam_mst, semester_exam_result_staging_2, semester_exam_school_qst_result, semester_exam_result_temp, semester_exam_school_result, semester_exam_qst_mst, semester_exam_result_staging_1, semester_exam_result_trans
Function call: select pat_del_data(p_data_source=>'semester_assessment_test',VARIADIC p_exam_code=>array['PAT0302290720201','PAT0302290720202']);

periodic_assessment_test
Parameters: exam_code
Tables: periodic_exam_mst, periodic_exam_result_staging_2, periodic_exam_school_qst_result, periodic_exam_result_temp, periodic_exam_school_result, periodic_exam_qst_mst, periodic_exam_result_staging_1, periodic_exam_result_trans
Function call: select pat_del_data(p_data_source=>'periodic_assessment_test',VARIADIC p_exam_code=>array['PAT0302290720201','PAT0302290720202']);

diksha_tpd
Parameters: batch_id
Tables: diksha_tpd_agg, diksha_tpd_trans, diksha_tpd_content_temp, diksha_tpd_staging
Function call: select diksha_tpd_del_data(p_data_source=>'diksha_tpd',VARIADIC p_batch_id=>array['0302290720201','0302290720202']);

diksha_summary_rollup
Parameters: from_date, to_date
Tables: diksha_content_staging, diksha_content_temp, diksha_content_trans, diksha_total_content
Function call: select diksha_summary_rollup_del_data('diksha_summary_rollup','2022-12-27','2022-12-31');

infrastructure
Parameters: all
Tables: infrastructure_temp, infrastructure_trans
Function call: select all_del_data('infrastructure');

static
Parameters: all
Tables: block_tmp, block_mst, district_tmp, district_mst, cluster_tmp, cluster_mst, school_master, school_tmp, school_hierarchy_details, school_geo_master
Function call: select all_del_data('static');

udise
Parameters: all
Tables: udise_sch_incen_cwsn, udise_nsqf_plcmnt_c12, udise_sch_enr_reptr, udise_nsqf_basic_info, udise_sch_incentives, udise_nsqf_trng_prov, udise_sch_exmmarks_c10, udise_nsqf_class_cond, udise_school_metrics_trans, udise_sch_exmmarks_c12, udise_sch_pgi_details, udise_nsqf_enr_caste, udise_sch_enr_age, udise_sch_exmres_c10, udise_sch_profile, udise_nsqf_enr_sub_sec, udise_sch_enr_by_stream, udise_sch_exmres_c12, udise_sch_recp_exp, udise_nsqf_exmres_c10, udise_sch_enr_cwsn, udise_sch_exmres_c5, udise_sch_safety, udise_nsqf_exmres_c12, udise_sch_enr_fresh, udise_sch_exmres_c8, udise_sch_staff_posn, udise_nsqf_faculty, udise_sch_enr_medinstr, udise_sch_facility, udise_tch_profile, udise_nsqf_plcmnt_c10, udise_sch_enr_newadm
Function call: select all_del_data('udise');