Collect business requirements and define robust data models and architectures
Design and build scalable and reliable data pipelines and workflows in cloud environments
Apply DevOps practices, including basic Git workflows and involvement in CI/CD pipelines.
Contribute to maintaining data quality, security, and data governance standards across all data-related activities.
Collaborate with cross-functional teams to ensure data solutions align with business needs and quality standards.
Specify and design presentation interfaces with optimal usability and user experience
Document processes and tasks to ensure explainability and understanding across the team
Support the integration of AI-based enrichment and transformation processes into existing data pipelines and workflows.

The following knowledge, experience and skills are required to perform the tasks listed above:

Business analysis & requirements gathering
Ability to collect, analyse and translate business needs into technical specifications.
Data modelling & architecture design
Skills in designing conceptual, logical and physical data models.
ETL/ELT and data integration
Ability to extract, transform, load, clean and merge datasets from multiple sources.
Building data pipelines & workflows
Experience with automated workflows and orchestration tools.
Big data management
Ability to handle large and complex datasets efficiently.

Specific expertise:

Excellent knowledge of Python, Spark and SQL
Excellent knowledge of designing and building ETL pipelines using tools such as Azure Synapse, Microsoft Fabric and/or AWS Glue
Excellent knowledge of data modelling and database design principles using the Medallion Architecture
Good knowledge of business intelligence tools, notably Microsoft Power BI

Knowledge of Machine Learning, Natural Language Processing and Large Language Models (LLMs) fundamentals

Additional skills:

Understanding of Microsoft Power Platform (e.g., Power Automate, SharePoint Lists)
Good knowledge of Microsoft Fabric components (Lakehouses, Pipelines, Dataflows Gen2, Notebooks, Semantic Models)
Good knowledge of cloud environments (AWS or Microsoft Azure)
Understanding of DevOps practices, including Git workflows and CI/CD pipelines, with experience using tools such as Azure DevOps, GitHub and GitLab.
Knowledge of no-code / low-code data science platforms such as KNIME and/or Dataiku.
Familiarity with the European Commission IT ecosystem and best practices
Experience documenting and organising processes using task management tools (e.g., Jira, OpenProject) and documentation platforms (e.g., Confluence, GitLab Wiki, GitHub Wiki).

Data Scientist
