Spark Engineer

Qi Group

Petaling Jaya, Selangor, Malaysia

3-5 Years

This job is no longer accepting applications

Posted 2 months ago

Job Description

We are seeking a highly skilled Spark Engineer to design, build, and optimize large-scale data processing systems using Apache Spark. The ideal candidate will have deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.

Key Responsibilities

Design, develop, and maintain big data solutions using Apache Spark (Batch and Streaming).
Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
Optimize Spark jobs for performance and scalability across large datasets.
Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
Implement data quality checks, monitoring, and alerting for Spark-based workflows.
Ensure security and compliance of data processing systems.
Troubleshoot and resolve data pipeline and Spark job issues in production environments.

Required Skills & Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
Strong programming skills in Scala, Java, or Python (PySpark).
Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
Experience with data serialization formats (Parquet, ORC, Avro).
Familiarity with data lake and cloud environments (AWS EMR, Databricks, GCP Dataproc, or Azure Synapse).
Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
Strong background in performance tuning and Spark job optimization.
Experience with CI/CD pipelines and version control (Git).
Familiarity with containerization (Docker, Kubernetes) is an advantage.

This is a 12-month contract appointment, with the potential for extension or conversion to a permanent position subject to performance and business needs.