
AI Collaboration for Data Engineering: Building Watson

How AI collaboration saved roughly 60 hours on a vulnerability management pipeline, with a task-by-task time breakdown and an estimated financial impact based on hourly rates

AI · Data Engineering · Productivity · BigQuery · ROI

The Request

The Information Security Governance (IS-GOV) team at MercadoLibre had an existing application called Watson — a vulnerability management system tracking security vulnerabilities across our IT assets. The application worked, but the data was locked in a MySQL database without analytical capabilities.

The ask: Build a data pipeline that makes Watson data available for self-service analysis in Looker Studio.

The scope:

  • 11 source tables from MySQL
  • Bronze → Silver → Gold medallion architecture
  • One analytics-ready data mart for executive dashboards
  • Daily automated refresh

This post documents how I used AI collaboration to accelerate the development — and where human judgment was still essential.


What I Actually Built

Project Structure

Watson Vulnerability Management Pipeline
├── Bronze Layer (STG) — 11 raw extraction tables
├── Silver Layer (TBL)
│   ├── Dimensions (LK_) — 5 tables
│   │   ├── LK_WATSON_ASSETS
│   │   ├── LK_WATSON_OFFICES
│   │   ├── LK_WATSON_PLUGINS
│   │   ├── LK_WATSON_SCANNERS
│   │   └── LK_WATSON_TENABLE_DETAILS
│   └── Facts/Bridge (BT_) — 6 tables
│       ├── BT_WATSON_VULNERABILITIES
│       ├── BT_WATSON_VULNERABILITY_TICKETS
│       ├── BT_WATSON_SCAN_HISTORY
│       ├── BT_WATSON_SCANS
│       ├── BT_WATSON_SCANNED_ASSETS
│       └── BT_WATSON_TICKET_VULNERABILITY_RELATIONS
└── Gold Layer (TBL) — 1 data mart
    └── DM_WATSON_CENTRAL_METRICS (68 columns)
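
To make the Bronze → Silver step concrete, here is a hedged sketch of how a Silver dimension such as LK_WATSON_OFFICES might be rebuilt from its staging table. The staging table and column names are assumptions for illustration, not the actual source schema:

-- Hedged sketch: rebuild a Silver dimension from its Bronze staging table.
-- Staging table and column names (office_id, office_name, region, country) are assumed.
CREATE OR REPLACE TABLE TBL.LK_WATSON_OFFICES AS
SELECT
  SAFE_CAST(office_id AS INT64) AS OFF_OFFICE_ID,      -- key, cast defensively
  TRIM(office_name)             AS OFF_OFFICE_NAME,
  TRIM(region)                  AS OFF_REGION_NAME,    -- later used for clustering
  TRIM(country)                 AS OFF_COUNTRY_NAME,
  CURRENT_TIMESTAMP()           AS OFF_LOAD_TIMESTAMP  -- audit column
FROM STG.STG_WATSON_OFFICES;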

The Gold Layer Data Mart

DM_WATSON_CENTRAL_METRICS is a wide table (68 columns) optimized for Looker Studio analysis. Each row represents one vulnerability, with all dimensions and metrics pre-calculated.

Key calculated fields:

  • DM_VUL_RISK_SCORE — Numeric risk score (Critical=100, High=75, Medium=50, Low=25)
  • DM_VUL_PRIORITY_DESC — Priority based on risk + age (e.g., “Critical Overdue”, “High Active”)
  • DM_VUL_AGE_DESC — Age categories (New ≤30 days, Recent 31-90 days, Aging 3-12 months, Old >1 year)
  • DM_TKT_SLA_COMPLIANCE_FLAG — Ticket SLA compliance tracking
  • DM_COUNTRY_OFFICE_DESC — Geographic analysis by region and office
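
To make the pre-calculated fields concrete, here is a hedged sketch of how two of them might be derived. The input column names (VUL_RISK_LEVEL, VUL_FIRST_SEEN_DATE) are assumed, and the production expressions may differ:

-- Hedged sketch of two calculated fields (input column names assumed).
SELECT
  CASE VUL_RISK_LEVEL                            -- DM_VUL_RISK_SCORE
    WHEN 'Critical' THEN 100
    WHEN 'High'     THEN 75
    WHEN 'Medium'   THEN 50
    WHEN 'Low'      THEN 25
    ELSE 0
  END AS DM_VUL_RISK_SCORE,
  CASE                                           -- DM_VUL_AGE_DESC
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 30  THEN 'New'
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 90  THEN 'Recent'
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 365 THEN 'Aging'
    ELSE 'Old'
  END AS DM_VUL_AGE_DESC
FROM TBL.BT_WATSON_VULNERABILITIES;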

Example query for the security team:

-- Active vulnerabilities by region, ranked by high-severity count
SELECT
    OFF_REGION_NAME,
    COUNT(*) AS total_vulnerabilities,
    SUM(DM_VUL_HIGH_SEVERITY_FLAG) AS high_severity_count,
    AVG(DM_VUL_AGE_DAYS) AS avg_age_days
FROM TBL.DM_WATSON_CENTRAL_METRICS
WHERE VUL_STATE_STATUS = 'Active'
GROUP BY OFF_REGION_NAME
ORDER BY high_severity_count DESC;

Time Savings Breakdown

Here’s my estimate of time saved on this project, task by task:

Task                                           Without AI   With AI     Time Saved
Schema design (16 tables)                      ~40 hours    ~10 hours   30 hours
Transformation SQL (430+ lines)                ~24 hours    ~8 hours    16 hours
Documentation (data dictionary, architecture)  ~16 hours    ~4 hours    12 hours
Testing & validation                           ~16 hours    ~14 hours   2 hours
Total                                          ~96 hours    ~36 hours   ~60 hours

Estimated time savings: ~60 hours (62% reduction)

What This Means Financially

Using conservative estimates for senior data engineer hourly rates:

  • At $75/hour: 60 hours saved = ~$4,500 per project
  • At $100/hour: 60 hours saved = ~$6,000 per project

Note: These are rough estimates based on typical market rates. Your actual value will depend on your hourly rate and how much of your work involves repetitive tasks like schema generation and documentation.

The real value isn’t the dollar amount — it’s the 60 hours I could spend on higher-value work instead of writing boilerplate SQL and data dictionaries.


Where AI Collaboration Helped

1. Schema Generation (~75% time savings)

The Watson MySQL database had 11 tables with different field naming conventions. I needed BigQuery-compatible schemas with:

  • Proper typing (INT64, STRING, DATE, BOOLEAN)
  • Enterprise naming conventions (prefixes like AST_, VUL_, OFF_)
  • Bilingual documentation (Spanish + English)

Without AI: Manually mapping each field, writing JSON schemas, documenting each column. Estimated 2-3 hours per table × 16 tables = ~40 hours.

With AI: I described the requirements and source structure. AI generated schema drafts that I reviewed and refined. About 30-40 minutes per table × 16 tables = ~10 hours.
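
For context, the target schema style looks roughly like the hedged DDL sketch below: typed columns, prefixed names, and bilingual descriptions. The columns shown are illustrative, not the actual Watson schema:

-- Hedged sketch of the schema style (columns are illustrative, not the real schema):
-- typed columns, AST_ prefix, bilingual (Spanish + English) descriptions.
CREATE TABLE IF NOT EXISTS TBL.LK_WATSON_ASSETS (
  AST_ASSET_ID     INT64  OPTIONS (description = 'ID del activo / Asset ID'),
  AST_NETBIOS_NAME STRING OPTIONS (description = 'Nombre NetBIOS del activo / Asset NetBIOS name'),
  AST_OFFICE_ID    INT64  OPTIONS (description = 'ID de la oficina asignada / Assigned office ID'),
  AST_CREATED_DATE DATE   OPTIONS (description = 'Fecha de alta del activo / Asset creation date')
)
OPTIONS (description = 'Dimensión de activos de Watson / Watson assets dimension');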

2. Transformation Logic (~67% time savings)

The gold layer SQL has 430+ lines with multiple CTEs:

  • latest_scan_per_asset — Gets the most recent scan per asset using ROW_NUMBER()
  • vulnerability_base — Core vulnerability data with risk scoring
  • latest_ticket_per_vulnerability — Gets the most relevant ticket per vulnerability
  • vulnerability_with_tickets — Joins everything together
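
The deduplication pattern behind latest_scan_per_asset and latest_ticket_per_vulnerability looks roughly like this (a hedged sketch; the source table choice and column names are assumed):

-- Hedged sketch of the latest_scan_per_asset pattern (column names assumed).
WITH latest_scan_per_asset AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY SCN_ASSET_ID         -- one winner per asset
      ORDER BY SCN_START_DATE DESC      -- most recent scan first
    ) AS rn
  FROM TBL.BT_WATSON_SCANNED_ASSETS
)
SELECT * EXCEPT (rn)
FROM latest_scan_per_asset
WHERE rn = 1;                           -- keep only the latest scan per asset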

AI helped with:

  • Boilerplate patterns (SAFE_CAST, NULLIF, DATE_DIFF)
  • CASE statements for categorization
  • JOIN strategy suggestions

I still owned:

  • Business logic (what counts as “overdue”?)
  • Deduplication strategy (which scan/ticket wins?)
  • Data quality decisions (how to handle nulls?)

3. Documentation (~75% time savings)

The project documentation includes:

  • Complete data dictionary (68 columns documented)
  • Architecture diagrams
  • Example queries for common use cases
  • Operations and maintenance guide

AI accelerated the initial drafts. I added the business context and validated accuracy.

4. Testing & Validation (~12% time savings)

This is where AI helps least. Testing still requires:

  • Understanding expected data patterns
  • Validating business logic against requirements
  • Checking edge cases from real data

AI can suggest test queries, but you still need to run them and interpret results.


Where Human Judgment Was Essential

Architecture Decisions

Why medallion architecture? The security team needed:

  • Bronze: Raw data preservation for audit trails
  • Silver: Type-safe, business-ready dimensions and facts
  • Gold: Pre-aggregated metrics for fast dashboard queries

AI could generate SQL, but couldn’t make this decision.

Business Context

The IS-GOV team tracks SLA compliance for vulnerability remediation. Different risk levels have different SLA windows:

  • Critical: 7 days
  • High: 30 days
  • Medium: 90 days

I learned this from conversations with the security team, not from AI.
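
Encoded in the gold layer, those SLA windows might be expressed roughly like this (a hedged sketch; the date columns and the fallback window for Low/Unknown are assumptions):

-- Hedged sketch of DM_TKT_SLA_COMPLIANCE_FLAG (date columns and the
-- fallback window for Low/Unknown risk are assumptions).
SELECT
  CASE
    WHEN TKT_RESOLVED_DATE IS NULL THEN NULL  -- ticket still open: compliance undetermined
    WHEN DATE_DIFF(TKT_RESOLVED_DATE, VUL_DETECTED_DATE, DAY) <=
         CASE VUL_RISK_LEVEL
           WHEN 'Critical' THEN 7
           WHEN 'High'     THEN 30
           WHEN 'Medium'   THEN 90
           ELSE 180                           -- assumed window for Low/Unknown
         END
    THEN 1
    ELSE 0
  END AS DM_TKT_SLA_COMPLIANCE_FLAG
FROM TBL.BT_WATSON_VULNERABILITY_TICKETS;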

Data Quality

The source data had issues:

  • Risk level variations (“CRITICAL” vs “Critical” vs “critical”)
  • Special characters in NetBIOS names
  • Missing office assignments for some assets

I added normalization logic:

CASE
  WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('CRITICAL') THEN 'Critical'
  WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('HIGH') THEN 'High'
  WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('MEDIUM') THEN 'Medium'
  WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('LOW', 'LOWER') THEN 'Low'
  ELSE 'Unknown'
END as VUL_RISK_LEVEL

These decisions required understanding the data and the team’s needs.


What I Learned

AI Accelerates, Doesn’t Replace

For repetitive tasks (schema generation, boilerplate SQL, documentation drafts), AI saved significant time. But the work that matters — understanding business requirements, making design decisions, validating quality — still required me.

Review Everything

AI-generated SQL isn’t always correct. I caught issues with:

  • JOIN conditions that would cause row explosion
  • Missing NULL handling
  • Incorrect date parsing formats

Every AI output got reviewed before use.

Start Small

I didn’t build the whole pipeline with AI. I started with schema documentation, saw it worked well, then expanded to transformation SQL, then to data dictionary generation.


The Result

The Watson pipeline runs daily at 00:00 UTC. The security team now has:

  • Self-service access to vulnerability analytics in Looker Studio
  • Executive dashboards showing risk by region, age, and SLA compliance
  • Operational reports for ticket tracking and remediation progress

Technical details:

  • Platform: BigQuery + DataFlow
  • Owner: IS-GOV Data Engineering (Role ID: 10427)
  • Refresh: Daily full refresh (TRUNCATE + INSERT)
  • Partitioning: Monthly by scan start date
  • Clustering: Office region for geographic queries
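
As a rough sketch of what those storage settings imply in BigQuery DDL (the scan start date column name is an assumption, and only two of the 68 columns are shown):

-- Hedged sketch of the mart's storage settings (column names assumed).
CREATE TABLE IF NOT EXISTS TBL.DM_WATSON_CENTRAL_METRICS (
  SCN_START_DATE  DATE,
  OFF_REGION_NAME STRING
  -- ...remaining columns omitted
)
PARTITION BY DATE_TRUNC(SCN_START_DATE, MONTH)  -- monthly partitions by scan start date
CLUSTER BY OFF_REGION_NAME;                     -- co-locate rows by office region

-- Daily full refresh pattern:
TRUNCATE TABLE TBL.DM_WATSON_CENTRAL_METRICS;
-- followed by INSERT INTO TBL.DM_WATSON_CENTRAL_METRICS SELECT ... ;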

Try This Yourself

If you’re curious about AI collaboration for data engineering:

  1. Pick a well-defined task — Schema documentation, boilerplate SQL, or data dictionary generation work well
  2. Be specific — “Generate a BigQuery schema for vulnerability tracking with fields X, Y, Z” beats “help me with a schema”
  3. Review everything — AI output is a starting point, not a final answer
  4. Notice the gaps — Where did you need to add business context? Those are the places where your judgment matters most

The Bottom Line

For this project, AI collaboration saved approximately 60 hours — time I could redirect to architecture decisions, stakeholder alignment, and higher-value work.

Estimated financial impact: $4,500-6,000 per project (based on $75-100/hour senior data engineer rates).

The methodology: I tracked time estimates for each task category, compared AI-assisted vs. manual approaches, and applied market hourly rates. These are estimates, not precise measurements — your actual savings will depend on how much of your work involves repetitive tasks like schema generation, boilerplate SQL, and documentation.

Key insight: AI accelerates execution on well-defined tasks. Humans still own strategy, business context, and quality validation.


This post reflects my experience building the Watson vulnerability management pipeline at MercadoLibre’s IS-GOV team. Time estimates are based on my experience with similar projects.