Databricks Engineer
rinf.tech
- București
- Permanent
- Full-time
- At rinf.tech, you'll encounter friendly people who are eager to explore and reinvent the world of technology.
- We encourage ideas - we like to share and learn from each other. We're all in for curious & ambitious people.
- We continuously invest in developing core teams focused on technologies like Blockchain, AI, and IoT.
- Our Technical Management team possesses a robust technical background; many of our team members have advanced to strategic roles through internal promotion.
- In a spirit of mutual commitment to sharing and growth, our RINFers commit to a minimum tenure of 2.5 years on a project.
- Fail fast, learn fast: we experiment, we iterate, we know when to stop and we don't repeat the same mistakes.
- The right technology stack for the right problem: we don't force technology choices just because we know them; our focus is on solving problems, not on pushing predefined stacks.
- Adapta Robotics is a successful spin-off born through an R&D project within rinf.tech.
- Design, develop, and maintain production-grade ETL/ELT pipelines using Apache Spark on Databricks
- Build robust data ingestion workflows from multiple sources including databases, APIs, streaming platforms, and cloud storage
- Implement and optimize Delta Lake architectures following medallion architecture patterns (Bronze, Silver, Gold)
- Develop real-time streaming data pipelines using Structured Streaming and Delta Live Tables
- Design and implement scalable data lakehouse solutions on Databricks
- Optimize Spark jobs for performance, cost-efficiency, and reliability
- Configure and manage Databricks workspaces, clusters, and compute resources
- Implement data partitioning, indexing, and caching strategies for optimal query performance
- Implement CI/CD pipelines for Databricks notebooks, jobs, and workflows
- Collaborate with data scientists to productionize ML models using MLflow
- Implement data governance policies using Unity Catalog
- Document technical designs, data lineage, and operational procedures
- Participate in code reviews and mentor junior team members
- Ensure compliance with data security, privacy, and regulatory requirements
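The medallion pattern named in the responsibilities above (Bronze, Silver, Gold) can be sketched in plain Python. This is only an illustration of the layering idea, not actual Spark or Delta Lake code; in practice each layer would be a Delta table transformed with PySpark. All record fields and function names here are hypothetical.

```python
# Illustration of medallion layering (Bronze -> Silver -> Gold).
# Plain Python stands in for Spark DataFrames; all names are hypothetical.

def bronze_ingest(raw_records):
    """Bronze: land raw records as-is, adding only ingestion metadata."""
    return [{**r, "_ingested": True} for r in raw_records]

def silver_clean(bronze):
    """Silver: validate and normalize - drop rows missing a user_id,
    cast amounts to float."""
    return [
        {"user_id": r["user_id"], "amount": float(r["amount"])}
        for r in bronze
        if r.get("user_id") is not None
    ]

def gold_aggregate(silver):
    """Gold: business-level aggregate - total amount per user."""
    totals = {}
    for r in silver:
        totals[r["user_id"]] = totals.get(r["user_id"], 0.0) + r["amount"]
    return totals

raw = [
    {"user_id": "a", "amount": "10.5"},
    {"user_id": None, "amount": "3.0"},   # dropped at the Silver layer
    {"user_id": "a", "amount": "4.5"},
]
print(gold_aggregate(silver_clean(bronze_ingest(raw))))  # {'a': 15.0}
```

The key point of the pattern is that each layer only reads from the one below it: raw data is preserved untouched in Bronze, quality rules live in Silver, and business logic lives in Gold.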
- Proven track record of building production-scale data pipelines processing large datasets
- Experience with at least one major cloud platform (Azure, AWS, or GCP)
- Strong proficiency in Python and SQL; Scala experience is a plus
- Databricks: Expert knowledge of Apache Spark (PySpark, Spark SQL), Delta Lake, Delta Live Tables
- Data Processing: Experience with batch and streaming data processing patterns
- Cloud Platforms: Proficiency with Azure Data Lake Storage, AWS S3, or GCP Cloud Storage
- Version Control: Git/GitHub for code collaboration and version management
- Orchestration: Experience with workflow orchestration tools (Databricks Workflows, Airflow, Azure Data Factory)
- Strong understanding of distributed computing concepts and data architecture
- Experience with data modeling, dimensional modeling, and data warehouse design
- Knowledge of data quality frameworks and testing methodologies
- Ability to write efficient, maintainable, and well-documented code
- Strong problem-solving skills and attention to detail
- Apply
- CV screening
- HR Interview
- Technical Interview
- Offer presented by our CEO