First, you’re probably a good fit if:
- You appreciate the difference between a CSV and a PSV file (because again and again, you had to parse a CSV file that didn’t conform to the “standard”). We’re not even mentioning TSV.
- You enjoy opening files in a hex editor and figuring out the data layout.
- “API docs? Who needs those?” – you, on a typical Tuesday afternoon
- You got excited when Python introduced data classes
Why are data engineers important?
As artificial intelligence evolves from simple automation to sophisticated agentic systems capable of independent decision-making and action, the data integration engineer has emerged as one of the most critical roles in any organization’s technical foundation. Agentic AI systems are only as intelligent as the data they can access and understand, and in most enterprises that data sits in fragmented silos, speaks different languages, is structured in incompatible formats, and is scattered across dozens of platforms.
The Role
We have developed our own cutting-edge ETL framework, which runs on top of AWS Lambda, SQS, and S3 and lets us maintain near real-time data freshness (we’re talking seconds). Owning the solution lets us integrate ETL tightly with the rest of the system, as opposed to keeping it as a boring unit that lives outside of it. It also makes us much more cost-efficient and lets us avoid the typical Lambda costs that show up at significant scale.
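To give a rough feel for the Lambda, SQS, and S3 pattern mentioned above, here is a minimal, generic sketch of an SQS-triggered Lambda handler that reads S3 event notifications. It is an illustration only and not the framework itself: the names `extract_s3_objects` and `process_object` are hypothetical, the processing step is a placeholder, and the partial-failure response assumes `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_body: str) -> List[Tuple[str, str]]:
    """Parse one SQS message body that carries an S3 event notification."""
    notification = json.loads(sqs_body)
    objects = []
    for record in notification.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def process_object(bucket: str, key: str) -> None:
    """Hypothetical placeholder for the real work (download, transform, load)."""
    print(f"would process s3://{bucket}/{key}")


def handler(event, context):
    """Lambda entry point for an SQS-triggered function."""
    failures = []
    for message in event.get("Records", []):
        try:
            for bucket, key in extract_s3_objects(message["body"]):
                process_object(bucket, key)
        except (KeyError, TypeError, ValueError):
            # Report only the failed message so SQS retries it alone
            # (assumes ReportBatchItemFailures is enabled).
            failures.append({"itemIdentifier": message["messageId"]})
    return {"batchItemFailures": failures}
```

Returning only the failed message IDs keeps one malformed file from forcing a retry of the whole batch, which matters when freshness is measured in seconds.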
You’ll regularly communicate with people outside the company: IT personnel at big liquor store chains, distributors, and brands, as well as solution engineers and developers at POS companies (think Square, Toast, NetSuite, etc.).
Responsibilities
- End-to-end responsibility (analyze, design, develop, test, and deploy) for data integration pipelines built on City Hive’s cloud ETL framework.
- Engage directly with customers to access their POS data and understand their data model.
- Maintain data pipelines as data and business requirements change.
- Understand the City Hive product and how the data affects it.
- Maintain a high level of service with regard to data and integration questions and issues.
Qualifications
- 2+ years of hands-on development experience with data pipelines / ETL in Python.
- You’ve written at least 5 decorators in Python in the last year 🙂
- You’re not afraid to learn Ruby (because we also use that), or you’ve already used it.
- An all-around player, with a start-up mentality, who doesn’t mind getting their hands dirty with whatever it takes to get things done.
- Advanced working knowledge of SQL and experience with relational databases, as well as working familiarity with a variety of other data sources (APIs, raw files, etc.).
- Ability to analyze data to identify deliverables, gaps, and inconsistencies.
- Good familiarity with IT tools.