Company Overview:
Lean Tech is a rapidly expanding organization based in Medellín, Colombia. We pride ourselves on having one of the most influential networks in software development and IT services for the entertainment, financial, and logistics sectors. Our corporate projections offer many opportunities for professionals to elevate their careers and experience substantial growth. Joining our team means engaging with large engineering teams across Latin America and the United States, contributing to cutting-edge developments in multiple industries.

Position Title: Senior Data Engineer
Location: Remote - LATAM

What you will be doing:
The Senior Data Engineer will be responsible for developing, optimizing, and maintaining data processes; troubleshooting and resolving related issues; and ensuring data accuracy and integrity. Overall, this person will help drive our data initiatives by designing, implementing, and maintaining effective data solutions that align with user requirements and organizational goals.

Your responsibilities will include:
- Design, build, and optimize scalable ETL pipelines using Apache Spark on Amazon EMR.
- Work closely with data scientists, analysts, and other engineering teams to define, implement, and maintain high-performance data infrastructure.
- Develop and maintain automated data workflows and processes for efficient data ingestion, transformation, and loading.
- Implement data engineering best practices, including monitoring, logging, and alerting for data pipelines.
- Collaborate with stakeholders to understand business requirements and translate them into technical solutions.
- Optimize the performance of data processing jobs and troubleshoot issues in large-scale distributed systems.
- Drive innovation in data infrastructure by evaluating and integrating new tools, frameworks, and approaches.
Requirements & Qualifications:
To excel in this role, you should possess:
- 5+ years of experience in data engineering, including at least 3 years working with Apache Spark and Amazon EMR.
- Strong programming skills in Python and Scala, with a focus on performance tuning and optimization of Spark jobs.
- Proven experience with SQL for data management, querying, and optimization.
- Deep understanding of distributed computing concepts, data partitioning, and resource management in large-scale data processing systems.
- Proficiency in building and maintaining ETL pipelines for structured and unstructured data.
- Hands-on experience with AWS services such as S3, Lambda, EMR, Glue, and RDS.
- Strong problem-solving skills and the ability to debug complex systems.

Preferred Qualifications:
- Experience with DevOps practices, including CI/CD, infrastructure as code (e.g., Terraform, CloudFormation), and containerization (e.g., Docker).
- Experience with Kubernetes and container orchestration for Spark jobs.
- Familiarity with streaming data processing using tools such as Kafka, Kinesis, or Flink.
- Experience with modern data lake architectures, including Delta Lake or Iceberg.
- AWS certification (e.g., AWS Certified Big Data – Specialty, AWS Certified Solutions Architect) is a plus.

Why you will love Lean Tech:
- Join a powerful tech workforce and help us change the world through technology.
- Professional development opportunities with international customers.
- A collaborative work environment.
- Career paths and mentorship programs that will help you reach new levels.

Join Lean Tech and contribute to shaping the data landscape within a dynamic and growing organization. Your skills will be honed, and your contributions will be vital to our continued success.

Lean Tech is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.