Role Overview:
We are seeking a highly skilled Cloud Data Engineer to design, build, and optimize data pipelines, AI-driven solutions, and cloud-based architectures. This role is ideal for individuals with strong logical and analytical thinking skills and hands-on experience with ETL processes, dashboards, and data engineering using PySpark. This is an exciting opportunity to work on Generative AI innovations and AWS native technologies in a hybrid work environment.
Key Responsibilities:
1. Data Pipeline and ETL Development:
- Build, optimize, and manage ETL pipelines using PySpark and AWS Glue for large-scale data processing.
- Design robust data workflows for processing structured and unstructured data.
- Ensure data integrity and security in all stages of processing.
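To give a concrete flavor of this responsibility, here is a minimal, stdlib-only sketch of the extract-transform-load shape such pipelines follow. The data, field names, and stages are invented for illustration; a production job would use PySpark or AWS Glue and read from S3 rather than an in-memory string:

```python
import csv
import io

# Toy "extract" source: in a real Glue/PySpark job this would be an S3 read.
RAW = "order_id,amount\n1,19.99\n2,\n3,5.00\n"

def extract(raw_text):
    """Parse CSV rows from a raw text source (stand-in for an S3 read)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Drop rows with missing amounts and cast types -- a basic
    data-integrity step of the kind the responsibilities describe."""
    return [
        {"order_id": int(r["order_id"]), "amount": float(r["amount"])}
        for r in rows
        if r["amount"]
    ]

def load(rows):
    """Stand-in "load" stage: aggregate instead of writing to a warehouse."""
    return sum(r["amount"] for r in rows)

clean = transform(extract(RAW))
total = load(clean)  # two valid rows survive; total is roughly 24.99
```

The same extract/transform/load separation carries over directly to PySpark, where each stage becomes a DataFrame operation.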
2. Dashboard and Data Visualization:
- Design and develop dashboards using tools like AWS QuickSight, Tableau, or Power BI.
- Collaborate with stakeholders to create insightful visualizations for data-driven decision-making.
3. AI/ML Model Development and Deployment:
- Develop, deploy, and maintain AI/ML models using frameworks such as TensorFlow, PyTorch, or Scikit-learn.
- Implement models on cloud platforms using AWS SageMaker and automate model training and deployment pipelines.
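In miniature, the train-serialize-deploy flow that a SageMaker pipeline automates looks like the sketch below. The model here is a deliberately trivial stand-in (it predicts the training mean), not a real TensorFlow or Scikit-learn model; the point is the artifact lifecycle:

```python
import pickle
import statistics

class MeanModel:
    """Trivial stand-in for an ML model: predicts the training mean.
    Hypothetical class for illustration only."""
    def fit(self, y):
        self.mean_ = statistics.fmean(y)
        return self

    def predict(self):
        return self.mean_

# "Training" step.
model = MeanModel().fit([10.0, 20.0, 30.0])

# "Deployment" step: serialize the trained artifact (as a pipeline would
# push to S3), then reload it where an endpoint serves predictions.
artifact = pickle.dumps(model)
served = pickle.loads(artifact)
prediction = served.predict()  # 20.0
```

Automating exactly this hand-off (train, persist the artifact, load it behind an endpoint) is what the deployment-pipeline responsibility refers to.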
4. Cloud Infrastructure and Data Management:
- Architect and deploy scalable data solutions using AWS services like Redshift, EMR, and S3.
- Use Infrastructure as Code tools (e.g., Terraform, AWS CDK, or CloudFormation) to automate deployments.
5. Performance Optimization:
- Optimize ETL pipelines, AI models, and data queries for performance, cost-efficiency, and scalability.
- Monitor data workflows and resolve bottlenecks proactively.
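One common class of bottleneck the optimization work targets is an unindexed lookup inside a join. The sketch below (with invented data) shows the pattern in plain Python: the same logic in PySpark would be a join without a broadcast or partitioning strategy, fixed the same way, by building a hash index once:

```python
import random

# Hypothetical workload: match event records to user records.
users = [{"id": i, "name": f"user{i}"} for i in range(10_000)]
events = [random.randrange(10_000) for _ in range(1_000)]

def join_scan(events, users):
    """Naive join: a full O(n) scan of users for every event."""
    return [next(u for u in users if u["id"] == e) for e in events]

def join_indexed(events, users):
    """Optimized join: build a hash index once, then O(1) lookups."""
    by_id = {u["id"]: u for u in users}
    return [by_id[e] for e in events]

# Both produce identical results; only the cost differs.
sample = events[:50]
assert join_scan(sample, users) == join_indexed(sample, users)
```

Profiling a slow pipeline stage and replacing a scan-per-row pattern with a keyed lookup is a typical proactive fix of the kind described above.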
6. Explore AWS and Generative AI Innovations:
- Gain hands-on experience with Generative AI tools and frameworks to create innovative data and AI solutions.
- Experiment with the latest AWS native technologies to enhance data pipelines and AI projects.
Requirements:
- Education: Bachelor's degree in Computer Science, Data Science, Engineering, or a related field. Equivalent practical experience will also be considered.
- Hands-on experience with ETL pipelines using PySpark and data transformation tools like AWS Glue.
- Proficiency in building interactive dashboards with tools like Tableau, AWS QuickSight, or Power BI.
- Strong programming skills in Python (preferred) or other languages for data processing and AI/ML development.
- Familiarity with cloud platforms (AWS preferred) and services like S3, Redshift, and SageMaker.
- Strong logical and analytical thinking skills for solving complex data problems.
- Knowledge of SQL and database management systems.
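As a small illustration of the SQL requirement, the snippet below runs a grouped aggregate against an in-memory SQLite database. The table and column names are invented; the same query shape applies to Redshift or any other warehouse:

```python
import sqlite3

# In-memory database standing in for a warehouse; schema is illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# A grouped aggregate -- the kind of query the role exercises daily.
rows = con.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 150.0), ('west', 75.0)]
```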
Preferred Skills:
- Relevant AWS Certifications (e.g., AWS Certified Data Analytics, AWS Certified Machine Learning) are a strong plus.
- Senior candidates (4+ years) should demonstrate expertise in PySpark, dashboard development, large-scale data processing, and AI/ML model deployment.
- Familiarity with monitoring tools for data pipelines and AI workflows.
- Strong communication skills for collaborating across teams and presenting data insights.