
Lead Data Engineer

Official · Data · Data Engineering · 25-GDS-2699
Location: Ho Chi Minh City

Job Description

We are looking for a candidate to lead GDS's Data Engineering team. The DE team is building GDS's Data Warehouse/Data Lake and multiple data infrastructure initiatives for VNGGames.

The role requires taking ownership of a complex, partially migrated data platform and leading the evolution of its infrastructure, tooling, and operations to serve both analytical and compliance needs (including SOX) at scale.

The lead will also make architectural decisions and drive operational reliability while managing cross-team collaboration and legacy complexity. 

Responsibilities: 
  • Lead GDS's data infrastructure initiatives and projects
  • Maintain and evolve the internal YAML-based SDK for batch and streaming workloads
  • Migrate the legacy Hadoop platform to a Spark-based platform on Kubernetes
  • Ensure GDS's data integrity and accuracy, and uphold sound data management practices
  • Build, maintain, and improve our current data integration pipelines (streaming and batch ETL), as well as the testing and production deployment frameworks/architecture for all GDS data products
  • Keep the whole DE team updated on current best practices and maintain data engineering standards
  • Manage a team of data engineers (10-12 members) 
  • Provide mentorship to team members
  • Effective communication with leadership, tech and business teams

Requirements

  • Bachelor's degree in Computer Science, Data Science, or a relevant field of study; a Master's or PhD is a plus
  • 7+ years of experience in data engineering
  • Extensive experience in designing and building ETL data pipelines, data models, and data warehouses/data lakes, and in optimizing data pipelines and architecture
  • Experience with big data technologies and distributed frameworks (e.g., Hadoop, Spark, Kafka), database technologies (e.g., MySQL, PostgreSQL, or NoSQL such as MongoDB), cloud platforms (e.g., AWS, GCP), and scheduling tools (e.g., Airflow)
  • Strong working knowledge of data structures and algorithms
  • Extensive experience with programming languages and frameworks (Scala, Python, R, Java, etc.)
  • Experience working with real-time data and streaming applications; experience with Spark Structured Streaming and Iceberg is a plus
  • Experience with CI/CD pipelines and Git
  • Careful, highly organized, with a self-learning attitude
  • Strong leadership, project management and communication skills