
Big Data Architect (Multiple Positions) | ZR_795_JOB

Agilisium Agilisiumonshore
Posted 2 weeks ago
Relocation support
United States
Data & Analytics

Support summary

Relocation support

Explicitly identified in the job description.

Visa sponsorship

No visa sponsorship identified.

About this role

Job Description

EMPLOYER: Agilisium Consulting LLC

TITLE: Big Data Architect (Multiple Positions)

LOCATION: Westlake Village, CA and various and unanticipated locations throughout the U.S. (Must be willing to work anywhere in the U.S. as the position may involve relocation to various and unanticipated client site locations; any relocation to be paid by employer pursuant to internal policy.)

DUTIES: Architect and lead the design of end-to-end data integration and analytics solutions. Develop scalable data lakehouse architectures following Medallion principles (Bronze, Silver, and Gold) to enable structured ingestion, transformation, and consumption of enterprise data. Build PySpark and Python-based ETL frameworks in Databricks for incremental and event-driven ingestion. Design and operationalize a data quality framework for proactive data validation and automated business alerts. Establish metadata-driven orchestration and reusable ETL components to ensure consistency and reusability across projects. Integrate regional and competitor data sources into the enterprise data lake with strong data governance and lineage. Oversee environment stability, configuration, and access management within Databricks and AWS ecosystems, ensuring compliance with enterprise security standards.

EOE

REQTS: Must have a Bachelor’s degree or foreign equivalent in Computer Science, Information Technology, Data Science, or a related field plus five (5) years of experience in the position offered, as a Big Data Specialist, Software Engineer, or a related position.

Must have five (5) years of experience with all of the following:

Designing and implementing high-volume data lakehouses and ETL frameworks on AWS, utilizing services including S3, Lambda, EC2, Glue, and EMR for large-scale distributed data processing;

Hands-on development and maintenance of PySpark-based ETL pipelines in Databricks, implementing Delta Lake with ACID transactions, data partitioning, filter pushdown, and data skew resolution for performance optimization;

Architecting and managing workflow orchestration using Apache Airflow, including configuration of Schedulers, Workers, and Executors, creation of custom operators, and implementation of log management and disaster recovery frameworks;

Building and maintaining data pipelines integrated with Snowflake and other cloud data warehouses, including SQL development, query tuning, and role-based access configuration to ensure security and performance;

Designing and implementing data quality and governance frameworks within the Medallion Architecture (Bronze, Silver, and Gold layers) to ensure data reliability, lineage, and compliance with enterprise governance standards;

Improving ETL efficiency and cost performance with Spark architecture, including query plans, Adaptive Query Execution (AQE), shuffle management, dynamic partition pruning, and memory optimization; and

Implementing data modeling, profiling, and standardization frameworks, applying scalable storage and archival strategies using AWS S3 Intelligent-Tiering and Glacier, and optimizing pipelines for high availability and fault tolerance.

HOURS: Full-Time; Mon-Fri (40 hrs/week)

SALARY: $175,000 per year
