Senior Data Engineer (AWS & Confluent Data/AI Projects)

  • Manila City, Metro Manila
  • Permanent
  • Full-time
  • 1 month ago
We are seeking a highly skilled and experienced Senior Data Engineer to join our growing team. This role will be pivotal in designing, building, and maintaining robust and scalable data pipelines and infrastructure, with a strong focus on real-time data streaming using Confluent Kafka, leveraging the full power of AWS cloud services, and enabling advanced Data and AI initiatives. The ideal candidate will possess a deep understanding of data engineering best practices, a proven track record of delivering complex data solutions in a cloud-native environment, and a passion for working with cutting-edge technologies to unlock data-driven insights.

Key Responsibilities:

  • Architect and Design Data Solutions: Lead the design and architecture of scalable, secure, and efficient data pipelines for both batch and real-time data processing on AWS. This includes data ingestion, transformation, storage, and consumption layers.
  • Confluent Kafka Expertise: Design, implement, and optimize highly performant and reliable data streaming solutions using Confluent Platform (Kafka, ksqlDB, Kafka Connect, Schema Registry). Ensure efficient data flow for real-time analytics and AI applications.
  • AWS Cloud Native Development: Develop and deploy data solutions leveraging a wide range of AWS services, including but not limited to:
      • Data Storage: S3 (Data Lake), RDS, DynamoDB, Redshift, Lake Formation.
      • Data Processing: Glue, EMR (Spark), Lambda, Kinesis, MSK (for Kafka integration).
      • Orchestration: AWS Step Functions, Airflow (on EC2 or MWAA).
      • Analytics & ML: Athena, QuickSight, SageMaker (for MLOps integration).
  • Data Pipeline Development: Build, optimize, and maintain complex ETL/ELT pipelines using Python, PySpark, Scala, or Java, ensuring data quality, reliability, and performance.
  • AI/ML Integration: Collaborate closely with Data Scientists and Machine Learning Engineers to integrate data pipelines that support the training, deployment, and monitoring of AI/ML models, including potential applications of Generative AI.
  • Data Governance & Quality: Implement and enforce data governance policies, data quality checks, and data security best practices (e.g., IAM roles, encryption, VPC networking) across all data assets.
  • Performance Optimization: Continuously monitor, tune, and optimize data processing jobs, data storage, and streaming infrastructure for performance, cost-efficiency, and scalability.
  • Mentorship and Leadership: Provide technical guidance, mentorship, and support to junior data engineers, fostering a culture of technical excellence and continuous learning.
  • Collaboration & Communication: Work effectively with cross-functional teams including Data Scientists, Business Analysts, Product Managers, and other Engineering teams to translate business requirements into robust technical solutions.
  • Documentation: Create and maintain comprehensive technical documentation, including data flow diagrams, architectural designs, and operational runbooks.
  • Stay Current: Research and evaluate new technologies, tools, and best practices in data engineering, cloud computing, and AI/ML to drive innovation.

Required Skills and Qualifications:

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related quantitative field.
  • 3 to 5 years of experience in data engineering, with a significant focus on cloud-based solutions.
  • Strong expertise in AWS data services (S3, Glue, EMR, Redshift, Kinesis, Lambda, etc.).
  • Extensive hands-on experience with Confluent Platform/Apache Kafka for building real-time data streaming applications.
  • Proficiency in programming languages such as Python, PySpark, Scala, or Java.
  • Expertise in SQL and experience with various database systems (relational and NoSQL).
  • Solid understanding of data warehousing, data lakes, and data modeling concepts (star schema, snowflake schema, etc.).
  • Experience with CI/CD pipelines and DevOps practices (Git, Terraform, Jenkins, Azure DevOps, or similar).
  • Familiarity with big data processing frameworks like Apache Spark.
  • Proven ability to design and implement scalable, fault-tolerant, and secure data architectures.
  • Experience working on data projects that support Machine Learning and Artificial Intelligence initiatives.
  • Strong problem-solving skills and the ability to troubleshoot complex data issues.
  • Excellent communication, interpersonal, and collaboration skills.

Preferred Qualifications (Nice to Have):

  • AWS Certifications (e.g., AWS Certified Data Analytics - Specialty, AWS Certified Solutions Architect - Associate/Professional).
  • Experience with other streaming technologies (e.g., Flink).
  • Knowledge of containerization technologies (Docker, Kubernetes).
  • Familiarity with Data Mesh or Data Fabric concepts.
  • Experience with data visualization tools (e.g., Tableau, Power BI, QuickSight).
  • Understanding of MLOps principles and tools.

Benefits:

This role offers a unique opportunity to contribute to high-impact data and AI projects, working with a dynamic team and leveraging the latest cloud and streaming technologies. If you are a passionate and skilled Senior Data Engineer looking to make a significant impact, we encourage you to apply.

  • Benefits are provided as applicable under Philippine law.
  • Filipino candidates will be given preference.

foundit