This is a remote position; we are hiring candidates from anywhere in the country. AgileEngine is one of the Inc. 5000 fastest-growing companies in the US and a top-3 ranked dev shop according to Clutch. We create award-winning custom software solutions that help companies across 15+ industries change the lives of millions. If you like a challenging environment where you're working with the best and are encouraged to learn and experiment every day, there's no better place - guaranteed!

What you will do

- Data Modeling: Design a clear, lean data model that outlines data sources and the transformations applied to them, built on DAGs and orchestration tools such as Dagster or Airflow. Validate data and test the data model at each DAG step.
- Insights Layer Ownership: Build data models and algorithms that generate first-party data using statistical and machine learning techniques, including LLMs and natural language processing. Generate derived insights and determine accurate values from error-prone sources (e.g., headcount information).
- Data Product Development: Develop and enhance data products that improve the discoverability of meaningful knowledge and information in our database. Continuously improve the similarity, relevance, normalization, and tagging algorithms that power our search engine.
- Pipeline Maintenance: Oversee the maintenance and health of our data pipelines to ensure accurate, efficient, and optimal data transformations, avoiding repetitive tasks and redundant operations on the data.
- Team Collaboration: Work with the team to devise product goals, outline milestones, and execute plans with minimal guidance.
- Data Warehouse Design: Contribute to the design of a robust data warehouse architecture, following best practices and industry standards. Transfer data from S3, load data on different schedules, and manage multiple data pipelines on top of a unified warehouse architecture.
- Platform Collaboration: Collaborate with our platform team on design decisions for the optimal middle-layer database flow, improving DAG execution times and costs.

Must haves

- 4+ years of experience as a Data Engineer
- Programming languages: Python, SQL
- Orchestration tools: Airflow, Dagster
- Data warehouses: Snowflake, Databricks
- ETL tools: dbt models
- Containerization: Docker
- DevOps: AWS
- Databases: ClickHouse, Postgres, DuckDB
- Upper-intermediate English level

The benefits of joining us

- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
- A selection of exciting projects: Join projects with modern solutions development and top-tier clients, including Fortune 500 enterprises and leading product brands.
- Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office - whatever makes you the happiest and most productive.