Qi Group

Spark Engineer

This job is no longer accepting applications

  • Posted 2 months ago

Job Description

We are seeking a highly skilled Spark Engineer to design, build, and optimize large-scale data processing systems using Apache Spark. The ideal candidate will have deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.

Key Responsibilities:

  • Design, develop, and maintain big data solutions using Apache Spark (batch and streaming).
  • Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
  • Optimize Spark jobs for performance and scalability across large datasets.
  • Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
  • Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
  • Implement data quality checks, monitoring, and alerting for Spark-based workflows.
  • Ensure security and compliance of data processing systems.
  • Troubleshoot and resolve data pipeline and Spark job issues in production environments.

Required Skills & Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
  • 3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
  • Strong programming skills in Scala, Java, or Python (PySpark).
  • Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
  • Experience with data serialization formats (Parquet, ORC, Avro).
  • Familiarity with data lake and cloud environments (AWS EMR, Databricks, Google Cloud Dataproc, or Azure Synapse).
  • Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
  • Strong background in performance tuning and Spark job optimization.
  • Experience with CI/CD pipelines and version control (Git).
  • Familiarity with containerization (Docker, Kubernetes) is an advantage.

This is a 12-month contract appointment, with the potential for extension or conversion to a permanent position subject to performance and business needs.

Job ID: 126510469