Who We Are
Sage Artificial Intelligence Labs (“SAIL”) is a nimble team within Sage building the future of cloud business management, using artificial intelligence to turbocharge our users’ productivity. The SAIL team builds capabilities that help businesses make better decisions through data-powered insights.
We are currently hiring a Senior Data Engineer to help us build machine learning solutions that provide insights to empower businesses and help them succeed. As part of our cross-functional team of data scientists and engineers, you will help steer the direction of the entire company’s Data Science and Machine Learning effort.
If you share our excitement for machine learning, value a culture of continuous improvement and learning, and want to work with cutting-edge technologies, apply today!
What You Will Do
Work closely with our data scientists and ML engineers to build data warehouses and pipelines
Collaborate with our AI Infrastructure team to extend the capabilities of our machine learning platform
Monitor and optimize the quality and performance of our data pipelines and tools
Collaborate with other teams to integrate new data sources, ensuring privacy and security protocols are followed
Collaborate with data scientists and ML engineers to design and build data analytics tools for detecting data anomalies and ensuring data quality
Design, implement, and operate pipelines that deliver data with measurable quality and SLOs
Create tools for establishing common data management patterns across our team and beyond
Write production-quality code to support our data pipelines and machine learning systems
Inform our strategy for data governance, security, privacy, quality, and retention
Work with our AI Infrastructure team to extend our capabilities, curate new data sets, and manage the data that drives our machine learning platform
Work with ML Engineers and Data Scientists to refine and specify data products that satisfy business policies and requirements
Perform exploratory data analyses and investigations
You Will Have
2+ years of relevant practical experience working on data integration, data modelling, and metadata management
Expert knowledge of and experience with several relevant languages, tools, and frameworks, e.g. Python, SQL, Scala, Spark
Database experience: you know how to write and optimize SQL queries, design efficient schemas for OLAP and OLTP, and understand the differences and tradeoffs between them
Willingness to adapt to significant changes in either technology or environment
Bachelor’s degree, preferably in a field that requires data management and manipulation (e.g. statistics, applied math, computer science, or a science field with direct statistics applications)
Fluency in data fundamentals: SQL, data management, and data manipulation using a procedural language
Strong quantitative and analytical skills, with a minimum of 2 years of experience building data-intensive applications
Experience with one or more workflow management technologies, e.g. Airflow or Argo
Deep understanding of relational as well as big data techniques and technologies (e.g. Postgres/MySQL, Spark, and data warehousing with S3, Redshift, Snowflake, etc.)
Ability to communicate complex ideas to non-technical stakeholders and to alternate between big-picture thinking and implementation details
You May Have
Previous experience in financial services
Experience developing and operating machine learning pipelines with one or more automation frameworks, such as Kubeflow, Argo, MLflow, or TFX
Advanced SQL skills for database management or analysis
Deep experience with data warehousing, schema management, time-series datasets, data validation, synthetic data generation, serialization protocols, and data privacy and security
Ability to wrangle data like a pro alligator wrestler and come out relatively unscathed
Curiosity about applications of machine learning in your personal life
What’s It Like to Work Here
You will have the opportunity to work in an environment where Data Science is central to what we do. The products we build are breaking new ground, and we focus on providing the best environment for you to do what you do best: solve problems, collaborate with your team, and ship first-class software. Our distributed team spans multiple continents; we promote an open, diverse environment, encourage contributions to open-source software, and invest heavily in our staff. Our team is talented, capable, and inclusive. We know that great things can only be done by great teams, and we look forward to continuing in this direction.
Key Responsibilities
- Design, implement, test, deploy, and monitor data pipelines in a way that best serves the functional and non-functional goals of the project
- Design, implement, test, deploy, and monitor internal data tools to streamline and automate the team’s workflows
- Contribute to the team’s agile practices around collaboration, planning, and continuous improvement