
. Develop 15 data ingestion pipelines from heterogeneous sources (REST APIs, SFTP file drops, database extracts) → S3 → Glue ETL → Lake Formation
. Implement ETL transformations per R2 data mapping specifications
. Build cross-agency data sharing patterns: Agency B data → Central Platform (Lake Formation cross-account grants, resource links)
. Implement data lineage tagging using OpenLineage / AWS-native lineage metadata for governance audit trail
. Configure data quality checks for multi-source ingestion - handle schema drift, late-arriving data, source unavailability
. Write and maintain IaC (CDK/Terraform) for R2 pipeline resources
. Execute unit testing, integration testing, and cross-agency data access validation
. Support UAT with Agency B data owners - validate data accuracy, timeliness, and access controls
. Document pipeline configurations, source connectivity patterns, and data flow diagrams
. Participate in daily stand-ups, sprint demos, and code reviews
Required skills: AWS Glue, Lake Formation, S3, Athena, Python/PySpark, multi-source integration (APIs, SFTP, DB extracts), IaC, SQL
Job ID: 149231915