Getting Started
The Data Validator helps you validate Excel data files against custom YAML validation rules. Upload your data file and rules, then get instant validation feedback with detailed error reporting.
Quick Start
- Download the sample files below to understand the format
- Prepare your Excel data file with column headers
- Create a YAML rules file defining validation requirements
- Upload both files and click "Start Validation"
- Review results and export validated data
Excel File Format
Your Excel file should have a header row with column names, followed by data rows.
Requirements:
- First row must contain column headers
- Column names should match the
display_name in your YAML file
- Maximum file size: 10MB
- Supported formats: .xlsx, .xls
Example Excel Structure:
Employee Identifier | First Name | Last Name | Date of Birth | Department
EMP001 | John | Doe | 1985-05-15 | Engineering
EMP002 | Jane | Smith | 1990-08-22 | Marketing
Creating YAML Validation Rules
The YAML file defines validation rules for each column in your Excel file.
Basic Structure:
fields:
Employee_Identifier:
display_name: "Employee Identifier"
type: string
required: true
min_length: 5
max_length: 10
pattern: "^EMP[0-9]+$"
First_Name:
display_name: "First Name"
type: string
required: true
min_length: 2
max_length: 50
Date_of_Birth:
display_name: "Date of Birth"
type: date
required: true
Field Properties:
- display_name: The exact column header in your Excel file
- type: Data type -
string, number, date, email
- required:
true or false - whether the field must have a value
- min_length/max_length: For string fields, minimum and maximum character length
- min_value/max_value: For number fields, minimum and maximum values
- pattern: Regular expression pattern the value must match
- allowed_values: List of permitted values (enum)
Example with Allowed Values:
Department:
display_name: "Department"
type: string
required: true
allowed_values:
- "Engineering"
- "Marketing"
- "Sales"
- "HR"
- "Finance"
Number Field Example:
Salary:
display_name: "Annual Salary"
type: number
required: true
min_value: 30000
max_value: 500000
Important: The field name (e.g., Employee_Identifier) is the internal key. The display_name must exactly match your Excel column header, including spaces and capitalization.
Creating Your Own YAML File:
- List all column headers from your Excel file
- For each column, create a field entry with an internal key (use underscores, no spaces)
- Set the
display_name to match your Excel column header exactly
- Choose appropriate
type: string, number, date, or email
- Set
required: true for mandatory fields, false for optional
- Add constraints like min_length, max_length, pattern, or allowed_values as needed
- Save the file with .yaml or .yml extension
Best Practices
- Start with sample data - test with 5-10 rows before validating large datasets
- Use clear, descriptive column names in your Excel file
- Add validation rules incrementally - start simple, then add more constraints
- Use patterns (regex) for structured data like employee IDs, phone numbers, etc.
- Set appropriate min/max values to catch data entry errors
- Use allowed_values for fields with limited options (departments, status, etc.)
- Review error messages carefully - they indicate exactly what failed validation
- Export validated results to track which records passed/failed
Common Patterns
Email Validation:
Email:
display_name: "Email Address"
type: string
required: true
pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
Phone Number (Singapore):
Mobile:
display_name: "Mobile Number"
type: string
required: true
pattern: "^(\\+65|65)?[689][0-9]{7}$"
Postal Code:
Postal_Code:
display_name: "Postal Code"
type: string
required: true
pattern: "^[0-9]{6}$"
min_length: 6
max_length: 6