Lowongan kerja Data Analyst (P2P Project)

  • Full Time
  • Jakarta
  • Posted 3 years ago



Job Description

About the Job
TaniHub Group is currently looking for skilled, proactive, and motivated data engineers to join our team.


We are a growing agritech company trying to solve the following data engineering challenges:

  • Complex, high-frequency data ingestion and warehousing for a busy fresh-produce supply chain and its e-commerce data analytics needs
  • Highly regulated data ingestion and warehousing for financial reporting
  • Large-scale, low-latency data pipelining, logging, and event tracking for Advanced Analytics and Machine Learning use cases


Successful candidates will empower company initiatives by building the tools, infrastructure, frameworks, and services that make data available for data analysts to query and for data scientists to model.


Responsibilities

  • Design, build, and maintain high-performance, scalable data ingestion and transformation pipelines using modern technologies such as serverless cloud data warehouses, CI/CD source control, and open-source workflow orchestrators
  • Maintain good communication with Software Engineers, Product Managers, and Business Stakeholders across functions to ensure the veracity and timeliness of the data processed
  • Collaborate closely with Data Analysts, Data Scientists, and Machine Learning Engineers to deliver reliable data products
  • Maintain high-quality technical documentation in the company’s internal wiki and conduct regular knowledge sharing, hands-on training, and mentoring for less experienced team members
  • Serve as the VP’s and Data Manager’s technical counsellor and researcher in designing a future-ready data infrastructure


Basic Qualifications

  • 3+ years of data engineering experience, or equivalent
  • Fluency in Python, SQL, and Bash scripting for data ingestion and transformation pipelines
  • Familiarity with various flavours of relational SQL databases, such as Postgres and MySQL, among others
  • Experience setting up batch workflow orchestration using Apache Airflow, Prefect, Dagster, Azkaban, or equivalent technologies
  • Experience setting up ELT pipelines using a serverless cloud data warehouse, preferably Google BigQuery or AWS Redshift
  • Familiarity with Apache Spark for batch data processing
  • Experience setting up data pipeline CI/CD using tools such as Jenkins or GitLab CI
  • Familiarity with standard software engineering practices such as object-oriented programming, modular design, logging, and alerting, among others
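
The engineering practices named above (modular design, logging, clean data transformation) can be illustrated with a minimal, hypothetical extract-transform-load sketch in Python. The source rows and the warehouse write here are stand-ins, not any actual TaniHub system:

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("elt")


@dataclass
class Record:
    """A typed row after transformation."""
    sku: str
    qty: int


def extract() -> list[dict]:
    # Hypothetical raw rows; a real pipeline would pull from an API or database.
    return [{"sku": "A-1", "qty": "3"}, {"sku": "B-2", "qty": "5"}]


def transform(rows: list[dict]) -> list[Record]:
    # Cast types and validate; log and skip bad rows rather than failing the run.
    out = []
    for row in rows:
        try:
            out.append(Record(sku=row["sku"], qty=int(row["qty"])))
        except (KeyError, ValueError) as exc:
            log.warning("skipping bad row %r: %s", row, exc)
    return out


def load(records: list[Record]) -> int:
    # Stand-in for a warehouse write (e.g. a BigQuery load job); returns row count.
    log.info("loaded %d records", len(records))
    return len(records)


if __name__ == "__main__":
    load(transform(extract()))
```

Keeping extract, transform, and load as separate pure-ish functions is what makes each step independently testable and easy to wrap in an orchestrator task.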

Preferred Qualifications

  • 5+ years of data engineering experience, or equivalent
  • Fluency in Python, SQL, Scala/Java/Go, and Bash scripting for data ingestion and transformation pipelines
  • Familiarity with various flavours of relational SQL databases (Postgres, MySQL, etc.) and NoSQL databases (MongoDB, Redis, etc.)
  • Experience setting up batch workflow orchestration using Apache Airflow, Prefect, Dagster, Azkaban, or equivalent technologies, plus familiarity with open-source ELT integrators such as Singer, Meltano, or Airbyte
  • Experience setting up ELT pipelines using a serverless cloud data warehouse, preferably Google BigQuery or AWS Redshift, and the ability to integrate event-streaming data into it using Change Data Capture (CDC), Pub/Sub/Kinesis, Apache Kafka, or equivalent technologies
  • Familiarity with Apache Spark/Flink/Beam for both batch and streaming data processing
  • Experience setting up data pipeline CI/CD, along with unit and integration testing, using tools such as Jenkins or GitLab CI
  • Fluency in standard software engineering practices such as object-oriented programming, modular design, logging, and alerting, among others
  • Familiarity with setting up data transformation tools such as dbt and integrating them with the aforementioned batch workflow orchestrators