We are seeking a highly skilled Spark Engineer to design, build, and optimize large-scale data processing systems using Apache Spark. The ideal candidate will have deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.
Key Responsibilities:
- Design, develop, and maintain big data solutions using Apache Spark (Batch and Streaming).
- Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
- Optimize Spark jobs for performance and scalability across large datasets.
- Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
- Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
- Implement data quality checks, monitoring, and alerting for Spark-based workflows.
- Ensure security and compliance of data processing systems.
- Troubleshoot and resolve data pipeline and Spark job issues in production environments.
Required Skills & Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
- 3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
- Strong programming skills in Scala, Java, or Python (PySpark).
- Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
- Experience with data storage and serialization formats (Parquet, ORC, Avro).
- Familiarity with data lake and cloud environments (AWS EMR, Databricks, GCP Dataproc, or Azure Synapse).
- Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
- Strong background in performance tuning and Spark job optimization.
- Experience with CI/CD pipelines and version control (Git).
- Familiarity with containerization (Docker, Kubernetes) is an advantage.
This is a 12-month contract appointment, with the potential for extension or conversion to a permanent position subject to performance and business needs.