AI Collaboration for Data Engineering: Building Watson
How AI collaboration saved roughly 60 hours building a vulnerability management pipeline, with a task-by-task time breakdown and an estimated financial impact based on hourly rates
The Request
The Information Security Governance (IS-GOV) team at MercadoLibre had an existing application called Watson — a vulnerability management system tracking security vulnerabilities across our IT assets. The application worked, but the data was locked in a MySQL database without analytical capabilities.
The ask: Build a data pipeline that makes Watson data available for self-service analysis in Looker Studio.
The scope:
- 11 source tables from MySQL
- Bronze → Silver → Gold medallion architecture
- One analytics-ready data mart for executive dashboards
- Daily automated refresh
This post documents how I used AI collaboration to accelerate the development — and where human judgment was still essential.
What I Actually Built
Project Structure
Watson Vulnerability Management Pipeline
├── Bronze Layer (STG) — 11 raw extraction tables
├── Silver Layer (TBL)
│   ├── Dimensions (LK_) — 5 tables
│   │   ├── LK_WATSON_ASSETS
│   │   ├── LK_WATSON_OFFICES
│   │   ├── LK_WATSON_PLUGINS
│   │   ├── LK_WATSON_SCANNERS
│   │   └── LK_WATSON_TENABLE_DETAILS
│   └── Facts/Bridge (BT_) — 6 tables
│       ├── BT_WATSON_VULNERABILITIES
│       ├── BT_WATSON_VULNERABILITY_TICKETS
│       ├── BT_WATSON_SCAN_HISTORY
│       ├── BT_WATSON_SCANS
│       ├── BT_WATSON_SCANNED_ASSETS
│       └── BT_WATSON_TICKET_VULNERABILITY_RELATIONS
└── Gold Layer (TBL) — 1 data mart
    └── DM_WATSON_CENTRAL_METRICS (68 columns)
The Gold Layer Data Mart
DM_WATSON_CENTRAL_METRICS is a wide table (68 columns) optimized for Looker Studio analysis. One row per vulnerability with all dimensions and metrics pre-calculated:
Key calculated fields:
- DM_VUL_RISK_SCORE — Numeric risk score (Critical=100, High=75, Medium=50, Low=25)
- DM_VUL_PRIORITY_DESC — Priority based on risk + age (e.g., “Critical Overdue”, “High Active”)
- DM_VUL_AGE_DESC — Age categories (New ≤30 days, Recent 31-90 days, Aging 3-12 months, Old >1 year)
- DM_TKT_SLA_COMPLIANCE_FLAG — Ticket SLA compliance tracking
- DM_COUNTRY_OFFICE_DESC — Geographic analysis by region and office
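To give a flavor of these derivations, the age bucket reduces to a CASE expression over DATE_DIFF. This is a minimal sketch, assuming a column like VUL_FIRST_SEEN_DATE holds the first detection date (the real column name in the pipeline may differ):

-- Minimal sketch of the age bucket; VUL_FIRST_SEEN_DATE is an assumed column name.
SELECT
  CASE
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 30 THEN 'New'
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 90 THEN 'Recent'
    WHEN DATE_DIFF(CURRENT_DATE(), VUL_FIRST_SEEN_DATE, DAY) <= 365 THEN 'Aging'
    ELSE 'Old'
  END AS DM_VUL_AGE_DESC
FROM TBL.BT_WATSON_VULNERABILITIES;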
Example query for the security team:
SELECT
OFF_REGION_NAME,
COUNT(*) as total_vulnerabilities,
SUM(DM_VUL_HIGH_SEVERITY_FLAG) as high_severity_count,
AVG(DM_VUL_AGE_DAYS) as avg_age_days
FROM TBL.DM_WATSON_CENTRAL_METRICS
WHERE VUL_STATE_STATUS = 'Active'
GROUP BY OFF_REGION_NAME
ORDER BY high_severity_count DESC;
Time Savings Breakdown
Here’s my estimate of time saved on this project, task by task:
| Task | Without AI | With AI | Time Saved |
|---|---|---|---|
| Schema design (16 tables) | ~40 hours | ~10 hours | 30 hours |
| Transformation SQL (430+ lines) | ~24 hours | ~8 hours | 16 hours |
| Documentation (data dictionary, architecture) | ~16 hours | ~4 hours | 12 hours |
| Testing & validation | ~16 hours | ~14 hours | 2 hours |
| Total | ~96 hours | ~36 hours | ~60 hours |
Estimated time savings: ~60 hours (62% reduction)
What This Means Financially
Using conservative estimates for senior data engineer hourly rates:
- At $75/hour: 60 hours saved = ~$4,500 per project
- At $100/hour: 60 hours saved = ~$6,000 per project
Note: These are rough estimates based on typical market rates. Your actual value will depend on your hourly rate and how much of your work involves repetitive tasks like schema generation and documentation.
The real value isn’t the dollar amount — it’s the 60 hours I could spend on higher-value work instead of writing boilerplate SQL and data dictionaries.
Where AI Collaboration Helped
1. Schema Generation (~75% time savings)
The Watson MySQL database had 11 tables with different field naming conventions. I needed BigQuery-compatible schemas with:
- Proper typing (INT64, STRING, DATE, BOOLEAN)
- Enterprise naming conventions (prefixes like AST_, VUL_, OFF_)
- Bilingual documentation (Spanish + English)
Without AI: Manually mapping each field, writing JSON schemas, documenting each column. Estimated 2-3 hours per table × 16 tables = ~40 hours.
With AI: I described the requirements and source structure. AI generated schema drafts that I reviewed and refined. About 30-40 minutes per table × 16 tables = ~10 hours.
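For a sense of what those drafts looked like, here is an illustrative BigQuery DDL in the same naming style. The dataset, table, and column names below are invented for the example, not the actual Watson schema; the bilingual descriptions follow the convention described above:

-- Illustrative schema draft in the project's naming style (not the real Watson schema).
CREATE TABLE IF NOT EXISTS STG.STG_WATSON_ASSETS (
  AST_ID            INT64  OPTIONS(description = "ID del activo / Asset ID"),
  AST_NETBIOS_NAME  STRING OPTIONS(description = "Nombre NetBIOS / NetBIOS name"),
  AST_OFFICE_ID     INT64  OPTIONS(description = "ID de oficina / Office ID"),
  AST_CREATED_DATE  DATE   OPTIONS(description = "Fecha de creación / Creation date"),
  AST_IS_ACTIVE     BOOL   OPTIONS(description = "Indicador de activo vigente / Active flag")
);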
2. Transformation Logic (~67% time savings)
The gold layer SQL has 430+ lines with multiple CTEs:
- latest_scan_per_asset — Gets the most recent scan per asset using ROW_NUMBER()
- vulnerability_base — Core vulnerability data with risk scoring
- latest_ticket_per_vulnerability — Gets the most relevant ticket per vulnerability
- vulnerability_with_tickets — Joins everything together
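Stripped of most columns and filters, the skeleton of that CTE chain looks roughly like this (the join keys and date columns are assumptions for illustration):

-- Simplified skeleton of the gold-layer CTE chain (join keys and date columns are assumed).
WITH latest_scan_per_asset AS (
  SELECT *
  FROM (
    SELECT
      s.*,
      ROW_NUMBER() OVER (
        PARTITION BY s.SCA_ASSET_ID
        ORDER BY s.SCA_SCAN_START_DATE DESC
      ) AS rn
    FROM TBL.BT_WATSON_SCANNED_ASSETS s
  )
  WHERE rn = 1
),
vulnerability_base AS (
  SELECT
    v.*,
    DATE_DIFF(CURRENT_DATE(), v.VUL_FIRST_SEEN_DATE, DAY) AS DM_VUL_AGE_DAYS
  FROM TBL.BT_WATSON_VULNERABILITIES v
)
SELECT
  b.*,
  s.SCA_SCAN_START_DATE
FROM vulnerability_base b
LEFT JOIN latest_scan_per_asset s
  ON b.VUL_ASSET_ID = s.SCA_ASSET_ID;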
AI helped with:
- Boilerplate patterns (SAFE_CAST, NULLIF, DATE_DIFF)
- CASE statements for categorization
- JOIN strategy suggestions
I still owned:
- Business logic (what counts as “overdue”?)
- Deduplication strategy (which scan/ticket wins?)
- Data quality decisions (how to handle nulls?)
3. Documentation (~75% time savings)
The project documentation includes:
- Complete data dictionary (68 columns documented)
- Architecture diagrams
- Example queries for common use cases
- Operations and maintenance guide
AI accelerated the initial drafts. I added the business context and validated accuracy.
4. Testing & Validation (~12% time savings)
This is where AI helps least. Testing still requires:
- Understanding expected data patterns
- Validating business logic against requirements
- Checking edge cases from real data
AI can suggest test queries, but you still need to run them and interpret results.
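A typical example of such a query is a grain check: the data mart is supposed to have one row per vulnerability, so any duplicates signal a problem. This is a generic pattern, with VUL_ID standing in for the actual key column:

-- Generic grain check: the mart should have one row per vulnerability.
-- VUL_ID is an assumed key column name.
SELECT VUL_ID, COUNT(*) AS row_count
FROM TBL.DM_WATSON_CENTRAL_METRICS
GROUP BY VUL_ID
HAVING COUNT(*) > 1;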
Where Human Judgment Was Essential
Architecture Decisions
Why medallion architecture? The security team needed:
- Bronze: Raw data preservation for audit trails
- Silver: Type-safe, business-ready dimensions and facts
- Gold: Pre-aggregated metrics for fast dashboard queries
AI could generate SQL, but couldn’t make this decision.
Business Context
The IS-GOV team tracks SLA compliance for vulnerability remediation. Different risk levels have different SLA windows:
- Critical: 7 days
- High: 30 days
- Medium: 90 days
I learned this from conversations with the security team, not from AI.
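To make those windows concrete, an SLA compliance flag can be derived by comparing ticket resolution time against the window for the vulnerability's risk level. The fragment below is a hedged sketch with assumed ticket column names (and an assumed fallback window for Low), not the production logic:

-- Sketch of an SLA compliance flag; TKT_CREATED_DATE and TKT_RESOLVED_DATE are
-- assumed column names, and the 365-day fallback for Low is an assumption.
CASE
  WHEN DATE_DIFF(COALESCE(TKT_RESOLVED_DATE, CURRENT_DATE()), TKT_CREATED_DATE, DAY)
       <= CASE VUL_RISK_LEVEL
            WHEN 'Critical' THEN 7
            WHEN 'High' THEN 30
            WHEN 'Medium' THEN 90
            ELSE 365
          END
  THEN 1
  ELSE 0
END AS DM_TKT_SLA_COMPLIANCE_FLAG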
Data Quality
The source data had issues:
- Risk level variations (“CRITICAL” vs “Critical” vs “critical”)
- Special characters in NetBIOS names
- Missing office assignments for some assets
I added normalization logic:
CASE
WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('CRITICAL') THEN 'Critical'
WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('HIGH') THEN 'High'
WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('MEDIUM') THEN 'Medium'
WHEN UPPER(TRIM(v.VUL_RISK_LEVEL)) IN ('LOW', 'LOWER') THEN 'Low'
ELSE 'Unknown'
END as VUL_RISK_LEVEL
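The other two issues came down to similarly small fixes: stripping unexpected characters from NetBIOS names and defaulting missing office assignments. A rough sketch, where the character whitelist and the 'Unassigned' default are illustrative choices rather than the actual rules:

-- Illustrative cleanup for the other two issues; the character whitelist and the
-- 'Unassigned' default are example choices, not the production rules.
REGEXP_REPLACE(TRIM(AST_NETBIOS_NAME), r'[^A-Za-z0-9\-\.]', '') AS AST_NETBIOS_NAME,
COALESCE(OFF_REGION_NAME, 'Unassigned') AS OFF_REGION_NAME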
These decisions required understanding the data and the team’s needs.
What I Learned
AI Accelerates, Doesn’t Replace
For repetitive tasks (schema generation, boilerplate SQL, documentation drafts), AI saved significant time. But the work that matters — understanding business requirements, making design decisions, validating quality — still required me.
Review Everything
AI-generated SQL isn’t always correct. I caught issues with:
- JOIN conditions that would cause row explosion
- Missing NULL handling
- Incorrect date parsing formats
Every AI output got reviewed before use.
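One simple way to catch the row-explosion issue early is to compare row counts before and after the joins. Since the mart is meant to be one row per vulnerability, a generic check looks like this:

-- Generic sanity check for join fan-out: with one row per vulnerability,
-- the mart should not have more rows than the base fact table.
SELECT
  (SELECT COUNT(*) FROM TBL.BT_WATSON_VULNERABILITIES) AS base_rows,
  (SELECT COUNT(*) FROM TBL.DM_WATSON_CENTRAL_METRICS) AS mart_rows;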
Start Small
I didn’t build the whole pipeline with AI. I started with schema documentation, saw it worked well, then expanded to transformation SQL, then to data dictionary generation.
The Result
The Watson pipeline runs daily at 00:00 UTC. The security team now has:
- Self-service access to vulnerability analytics in Looker Studio
- Executive dashboards showing risk by region, age, and SLA compliance
- Operational reports for ticket tracking and remediation progress
Technical details:
- Platform: BigQuery + DataFlow
- Owner: IS-GOV Data Engineering (Role ID: 10427)
- Refresh: Daily full refresh (TRUNCATE + INSERT)
- Partitioning: Monthly by scan start date
- Clustering: Office region for geographic queries
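Put together, the gold table definition and its refresh follow a pattern roughly like the one below; the partition and cluster column names are inferred from the bullets above, so the real DDL may differ:

-- Approximate shape of the gold table and its daily refresh
-- (partition and cluster columns are inferred from the description above).
CREATE TABLE IF NOT EXISTS TBL.DM_WATSON_CENTRAL_METRICS (
  VUL_ID               INT64,
  OFF_REGION_NAME      STRING,
  SCA_SCAN_START_DATE  DATE
  -- ...plus the remaining metric and dimension columns
)
PARTITION BY DATE_TRUNC(SCA_SCAN_START_DATE, MONTH)
CLUSTER BY OFF_REGION_NAME;

-- Daily full refresh: reload everything rather than merge increments.
TRUNCATE TABLE TBL.DM_WATSON_CENTRAL_METRICS;
-- followed by: INSERT INTO TBL.DM_WATSON_CENTRAL_METRICS SELECT ... (the gold-layer query)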
Try This Yourself
If you’re curious about AI collaboration for data engineering:
- Pick a well-defined task — Schema documentation, boilerplate SQL, or data dictionary generation work well
- Be specific — “Generate a BigQuery schema for vulnerability tracking with fields X, Y, Z” beats “help me with a schema”
- Review everything — AI output is a starting point, not a final answer
- Notice the gaps — Where did you need to add business context? Those are the places where your judgment matters most
The Bottom Line
For this project, AI collaboration saved approximately 60 hours — time I could redirect to architecture decisions, stakeholder alignment, and higher-value work.
Estimated financial impact: $4,500-6,000 per project (based on $75-100/hour senior data engineer rates).
The methodology: I tracked time estimates for each task category, compared AI-assisted vs. manual approaches, and applied market hourly rates. These are estimates, not precise measurements — your actual savings will depend on how much of your work involves repetitive tasks like schema generation, boilerplate SQL, and documentation.
Key insight: AI accelerates execution on well-defined tasks. Humans still own strategy, business context, and quality validation.
This post reflects my experience building the Watson vulnerability management pipeline at MercadoLibre’s IS-GOV team. Time estimates are based on my experience with similar projects.