Data Engineer

  • Category: Information Technology
  • Industry: Nuclear
  • Type: Full-time
  • Job ID: #181964

Duration: 2 years

Location: 700 University (889 or Whitby)

 

Job Overview

  • Design & build large-scale data pipelines and data infrastructure leveraging the wide range of data sources across the organization
  • Document and assist in developing best-practice data delivery solutions using enterprise data ingestion, ETL, and data management tools
  • Work closely with infrastructure teams to ensure an optimal data and advanced analytics platform for current and future states
  • Clean, prepare and optimize datasets, ensuring lineage and quality controls are applied throughout the data integration cycle
  • Support Business Intelligence Analysts in modelling data for visualization and reporting
  • Stay current with advanced technologies, including AI/Machine learning, Data Management, and Cloud Data Storage techniques
  • Create & document efficient data pipelines (ETL/ELT)
  • Write and optimize complex queries on large data sets
  • Transform and map data into more valuable and understandable sets for consumption
  • Create tooling to help with day-to-day tasks
  • Troubleshoot issues related to data accuracy and establish a single source of truth
  • Help remove friction for other members of the organization, allowing them to focus on their primary objectives
  • Introduce new technologies to the environment through research and POCs
  • Reduce toil through automation
  • Collaborate with business analysts, data scientists, software engineers, and solution architects to develop data pipelines to feed our data marketplace
  • Extract, analyze & interpret large, complex datasets for use in predictive modelling
  • Utilize Azure tools to develop automated, productionized data pipelines
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Develop and support ETL code for the data warehouse and data marts to support reporting and data analytics systems
  • Work with tools in the Microsoft stack: Azure Data Factory, Azure Data Lake, Azure SQL Databases, Azure Data Warehouse, Azure Synapse Analysis Services, Azure Databricks, and Power BI

Qualifications

  • Bachelor’s degree in Computer Science, Engineering or a related field
  • 3 years of proven experience as a Data Engineer in a Big Data environment
  • Knowledge of various advanced data mining techniques
  • Experience with integrating structured and unstructured data across various platforms and sources
  • Experience working with SAP as a data source is preferred
  • Proficient in SQL database management systems
  • Confident & versatile IT professional with the ability to communicate effectively across all levels of the Business and IT community
  • Strong CS fundamentals, including data structures and algorithms
  • Strong understanding of Statistics
  • Experience working and preparing data for Data Science / Machine Learning models preferred
  • Experience with Azure Data Lake preferred
  • Experience creating ETL jobs using Azure Pipeline and Dataflow
  • Strong knowledge of programming methodologies (version control, testing, QA) and agile development practices
  • In-depth knowledge of Azure tools required to develop automated, productionized data pipelines
  • In-depth knowledge of and experience with relational (SQL) and NoSQL databases
  • Fluency with SQL, R and Python (pandas, boto3, scikit-learn, sparkmagic)
  • Experience working with large, complex datasets
  • Excellent communication, writing and interpersonal skills
  • Ability to prioritize competing requests and multiple tasks in a fast-paced, deadline-driven environment
  • Experience managing a project backlog and working cross-functionally with multiple stakeholders
  • Ability to work effectively on a self-organizing team with minimal supervision
  • Proactive and creative problem solver with the ability to multitask and manage tight deadlines
  • Power BI experience is a big plus