I work at the boundary between data engineering and infrastructure —
the part where pipelines need to be reliable, cheap to run, and
observable by the people downstream from them.

At Making Science I led the build-out of a multi-channel marketing
platform pulling from 15+ ad APIs into per-client BigQuery projects.
The architecture was hub-and-spoke: a shared ingestion layer feeding
isolated customer warehouses, orchestrated by Airflow on Cloud
Composer, transformed through layered dbt models. The interesting
problems were rarely about ingestion itself — they were about cost,
tenancy, and standing up new customers without engineering toil.
The Terraform module library I built cut new-pipeline deployment time
from an hour to roughly fifteen minutes.

Before that I worked on geospatial ML pipelines at LiveEO
(satellite imagery, Anyscale, spot EC2), and AWS warehousing at
Ryan-Miranda. My early years were spent building a low-code data
platform on Flask, NiFi, and a stack of custom connectors,
which is where I learned that the boring parts of data work
(auth, retries, schema drift, idempotency) are usually the ones
that matter most.

I'm currently studying for the GCP Associate Data Practitioner cert
and am open to senior data engineering roles, remote or with relocation
for the right team.