
Data Validation after Ingestion

NIFI fetches the data from the emission data storage and sends it to the input data storage. Once the data reaches the input data storage, the following validations are performed:

  • Column-level validations: column datatype mismatches and an incorrect number of columns.

  • Improper data handling: missing/null values for mandatory fields, empty data files, special characters, and blank lines in data files.

  • Duplicate records validation: NIFI detects duplicate records by grouping records of the same kind together.
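The column-level and improper-data checks listed above can be illustrated with a minimal Python sketch. This is not the NIFI flow that cQube ships; the `SCHEMA` dictionary, the `SPECIAL_CHARS` pattern, and the `validate_csv` function are hypothetical names introduced only for illustration, and the definition of a "special character" is an assumption.

```python
import csv
import io
import re

# Hypothetical schema: column name -> (type caster, mandatory flag).
SCHEMA = {
    "school_id": (int, True),
    "school_name": (str, True),
    "latitude": (float, False),
}

# Assumed definition of "special characters" for this sketch only.
SPECIAL_CHARS = re.compile(r"[^\w\s.,\-]")


def validate_csv(text):
    """Return (valid_rows, errors) for CSV text checked against SCHEMA."""
    # Blank lines are dropped before any other check.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:  # nothing at all, or a header with no data rows
        return [], ["empty data file"]
    reader = csv.reader(io.StringIO("\n".join(lines)))
    header = next(reader)
    if header != list(SCHEMA):
        return [], [f"unexpected columns: {header}"]
    valid, errors = [], []
    # Row numbers count non-blank lines, with the header as row 1.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(SCHEMA):
            errors.append(
                f"row {row_no}: expected {len(SCHEMA)} columns, got {len(row)}"
            )
            continue
        problem = None
        record = {}
        for (name, (cast, mandatory)), raw in zip(SCHEMA.items(), row):
            value = raw.strip()
            if value == "":
                if mandatory:
                    problem = f"missing mandatory field {name}"
                    break
                record[name] = None
                continue
            if SPECIAL_CHARS.search(value):
                problem = f"special characters in {name}"
                break
            try:
                record[name] = cast(value)
            except ValueError:
                problem = f"datatype mismatch in {name}"
                break
        if problem:
            errors.append(f"row {row_no}: {problem}")
        else:
            valid.append(record)
    return valid, errors
```

Calling `validate_csv` on the contents of a data file returns the rows that passed every check together with one message per rejected row, which is roughly the split between processed and eliminated records described in this section.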

Validation for duplicate records: When records have duplicate values for all fields (mirror records), NIFI keeps the first record and eliminates the rest.

Records with duplicate ID: Of the remaining duplicates, records that share the same ID (the primary/composite key, such as student ID, assessment ID, infra ID, or CRC visit ID) but have different values in the other columns are not inserted into the database tables, because the ID is the primary key.

Example: In the school_master data file, NIFI eliminates records that have the same school_id but different Lat Long details, different names, or other differing values.

Records with same values: For the semester report, NIFI eliminates records that have the same values for the fields Student ID, School ID, semester, and studying class but different values for the subjects.
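The duplicate-handling rules above can be sketched in Python as follows. This is an illustrative sketch, not the NIFI implementation: the `deduplicate` function and its parameters are hypothetical, and it assumes that every record sharing a key with conflicting values is rejected (the text does not say whether one of them is retained).

```python
from collections import defaultdict


def deduplicate(records, key_fields):
    """Collapse mirror records, then reject keys that carry conflicting values.

    records    -- list of dicts with hashable values
    key_fields -- names of the primary/composite key columns
    """
    # Step 1: mirror records (identical in every field) keep the first copy.
    seen = set()
    unique = []
    for rec in records:
        fingerprint = tuple(sorted(rec.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(rec)

    # Step 2: group what is left by its key.
    by_key = defaultdict(list)
    for rec in unique:
        by_key[tuple(rec[f] for f in key_fields)].append(rec)

    # Assumption: a key with more than one distinct record is rejected whole.
    kept = [recs[0] for recs in by_key.values() if len(recs) == 1]
    rejected = [rec for recs in by_key.values() if len(recs) > 1 for rec in recs]
    return kept, rejected
```

For the school_master example, `key_fields` would be `["school_id"]`; for the semester report it would be the four fields named above, so that records differing only in subject values land in the rejected list.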

Overlapping data validation: How overlapping data is validated depends on the data source.

  • The NIFI process for student attendance reports checks the last updated day's record in the transaction table and processes records starting from the day after the last updated date. Records from earlier days are not considered for NIFI processing.

  • For the other data sources, duplicate records that share the same ID (student ID, assessment ID, infra ID, or CRC visit ID) but have different values are not inserted into the database tables, because the ID is the primary key.
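The student attendance rule above amounts to a date cutoff. A minimal Python sketch, assuming each record carries an `attendance_date` field (a hypothetical name) and that an empty transaction table means every record is processed (also an assumption):

```python
from datetime import date


def select_new_attendance(records, last_updated):
    """Return only records dated after the last updated date.

    last_updated -- date of the latest record in the transaction table,
                    or None if the table is empty (assumed: process all).
    """
    if last_updated is None:
        return list(records)
    # Records on or before the last updated date are skipped.
    return [rec for rec in records if rec["attendance_date"] > last_updated]


incoming = [
    {"student_id": 101, "attendance_date": date(2022, 3, 9)},
    {"student_id": 101, "attendance_date": date(2022, 3, 10)},
    {"student_id": 101, "attendance_date": date(2022, 3, 11)},
]
new_records = select_new_attendance(incoming, last_updated=date(2022, 3, 10))
```

With a last updated date of 10 March 2022, only the 11 March record is selected; re-emitted data for 9 and 10 March is ignored.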

Other data issues: Data is also handled in cases such as job failures, missing data for certain days, late receipt of data (data received after a few days), wrongly updated data, and on-request corrections (when an issue is identified in a report).


Last updated 2 years ago
