Ignite Your Insights

Course Overview

Multiple speakers present a business case for applying AI / ML (artificial intelligence and machine learning) in the Data Vault landscape. These speakers then share how-to techniques, diving deep into the methodology and the implementation or application of these concepts. From the business use-cases and reasons for…

Learning Maximized!

Achieve expertise quickly

Course Hours

High quality content, self-paced video for your maximized learning.

Total Lessons

Extensive training with focused topics leading to your success.

Enrolled Students

Join other students currently engaged in your learning journey.

Course Lessons

Speaker: Heli Helskyaho

Machine learning is here and it will stay. Everybody should know what it is about and where it could be used. So what is machine learning and where can it be used?

What is unsupervised learning, supervised learning, or reinforcement learning? What is deep learning? In this presentation we will talk about clustering, regression, classification, and more. We will also cover measuring and improving the model, feature selection, feature transformation, principal component analysis, and hyperparameter tuning.

At the end of the session we discuss what else there is to learn and how to get started with machine learning.

In this presentation you will learn what machine learning is all about and hopefully get so excited about the whole thing that you want to learn more! After this presentation it will be easier to start learning. This presentation is an improved version of one that was awarded best presentation in the Emerging Technologies track at the KScope19 conference.

Please note: Heli is a Certified Data Vault 2.0 Practitioner, and has been working with Data Vault for over 5 years.  This presentation is NOT to be missed!  She will tie machine learning to Data Vault across both of her presentations.

Speaker: Richard Strange

Can Data Vault help with data-driven academic research? Academic research is following the trends of the private sector, with the dual problems of growing volume and complexity of data. As reproducibility of data-driven research comes under greater scrutiny, the need to share data sets is more important than ever, but how do we bridge the gap between current practice and the necessary capabilities in research?

Topics Covered:

  • Current state and challenges of data-driven research
  • Case studies
  • Model approaches for applying Data Vault methods

Speaker: Bruce McCartney

The Business Vault is increasingly a critical component of the Data Vault 2.0 architecture. The purpose of this presentation is to discuss some of the Artificial Intelligence advances in automating 'soft' business rules.

The audience will learn various possible implementation patterns for automating business rules using Artificial Intelligence. This includes an overview of rules engines, machine learning, deep learning, and causal inference taxonomies, along with techniques to augment the raw data into actionable insights. An additional introduction to some modelling terminology and techniques will give the audience a high-level understanding of how data science techniques can automate business rule development, as well as some alternatives for auditing and debugging AI pipelines.

The presentation will outline various implementation patterns for using Artificial Intelligence in your information pipelines, including exposure to software tools for development and management of these new paradigms. Finally, it offers a practical discussion of AI for augmenting data catalogs (identifying business keys) and of how AI is used to forecast weather.

Topics Covered:

  • Business Vault Overview
  • Rules Engine Approach to Business Rules
  • Machine and deep learning Overview
  • Causal Inference

Speaker: Matt Florian

Title: Automated Machine Learning and Data Vault

With Python becoming ubiquitous in data engineering, there is an increasing trend to return to hand-coding the building of the data vault. This session will present a tale of one project that did both Python data engineering and metadata code engine development. It will highlight the strengths and weaknesses of each approach using real-world examples. We will review the strengths of each in data sourcing, modeling, profiling, and consumption. It will conclude with a breakdown of which approach won the day for the project and the benefits achieved.

In the initial iteration of the project, a target logical data model was created to integrate two ERP systems into a single data platform. The goal was to enable historical data analytics to support a rolling deployment of SAP across the enterprise. The decision was made at the outset to use Python with AWS Glue to consume data from an S3 raw lake and land it in a Snowflake database. With a team of data engineers, the initial data was landed easily. However, the velocity of the team was limited, and the scope of what could be accomplished in the target timeline was continually scaled back. This meant the team could not release data value at the speed the business needed.

An alternative approach was then adopted to use a metadata-driven development tool to achieve the desired velocity. This required retooling Python developers into being data modelers and IDE developers. Once the team was able to move past the learning curve, the velocity increased, the quality of data increased, and the value to the business was realized. The outcome was a data platform consisting of well-thought-out data and technical architecture to meet the needs of data and knowledge workers.

Major Topics Covered

  • Data Vault 2.0 on Snowflake
  • Python in AWS Glue
  • WhereScape 3D & RED
  • Metadata Code Generator

Agenda

  • Establish Client Scope for ERP Data Integration
  • Hand Coded Development Architecture
  • Project Achievements and Outcomes
  • Python Strengths and Weaknesses Retrospective
  • Metadata Code Generator Architecture
  • Project Achievements and Outcomes
  • Code Generator Strengths and Weaknesses Retrospective
  • Approach Comparison and Learnings

Speaker: Heli Helskyaho

Successful machine learning needs a team of people with different skills, and you are only one person with the skillset you have. How do you get started with machine learning? Do you need to go back to school to learn mathematics and statistics again? Do you need to learn all the machine learning processes, algorithms, hyperparameters, and whatever else is related to machine learning?
Panic!

But luckily there is no need to panic: AutoML is for you. Automated machine learning (AutoML) is a shortcut to machine learning. It automates many of the steps in the machine learning process and lets you concentrate on your expertise: data and Data Vault. In this presentation we talk about AutoML and see some demos of using automated machine learning on Data Vault data.

Major Topics Covered
  • AutoML
  • Machine Learning

Agenda
  • What is AutoML
  • Why and when AutoML
  • How to use AutoML

Speaker: Heli Helskyaho

The biggest problem with machine learning is data. There is not enough data, the data is not correct, etc. Data Vault is an excellent source for machine learning since it has good quality data that has been modeled, understood, and checked with the Data Vault methodology processes.

How do you do machine learning in a Data Vault database? What tools are available, and how is it best to get started?
In this presentation we will talk about these questions and see some examples of machine learning in real life.

Topics Covered:

  • Why Data Vault is a great source for Machine Learning
  • How to use your Data Vault data for Machine Learning

Speakers: Heli Helskyaho and Mattias Helskyaho

You are a business expert and a real guru with data, but why does machine learning seem so difficult?

Because successful machine learning needs a team of people with different skills—and you're only one person.

How do you get started with machine learning?

Do you need to go back to school to relearn mathematics and statistics?

Do you need to study all the processes, algorithms, hyperparameters, and whatever is related to machine learning?

Luckily there is no cause for panic. In this presentation we discuss two easy ways of starting with machine learning: AutoML and AI services.

Presentation Topics: AutoML, AI Services, Machine Learning, analytics

Speaker: Hannu Jarvi

Machine Learning / Artificial Intelligence is largely about recognizing patterns in data.
You probably have seen, on TV news reports, stories about crime and noticed how the accused perpetrator’s or victim’s face has been blurred or pixelated.

The resolution has been lowered to hide the patterns that make that person recognizable. Something very similar happens when the granularity of data changes in traditional Data Warehousing. The data is “pixelated”. When Machine Learning can no longer recognize patterns in the data because it has been “pixelated”, that information is lost forever.

Data Vault is the first Data Warehousing paradigm that seeks to preserve all information, thus it is also the first paradigm that provides a solid foundation for ML/AI.

The Machine Learning pipeline itself creates a lot of new information. In addition to the end result (for example, an evaluation of an X-ray), it provides a lot of intermediate information: alternative explanations with varying probabilities, different results with different parameter values, and so on. This information may eventually prove valuable when analyzed against actual outcomes.

The volume and complexity of this kind of information may grow exponentially. A Data Lake, the data management workhorse of the ML/AI community, cannot deal with this complexity. The information is “Lost in the Exhaust” of the ML pipeline.

This presentation explores ideas for capturing ALL value created in ML/AI pipelines. The ideas are based on challenges faced by some leading ML scientists in Healthcare and Pharma, and present the opportunities provided by Data Vault 2.0 for solving these challenges.

Topics Covered:

  • Why AI / ML needs pristine data in its original grain
  • Why DV is an ideal foundation for AI / ML
  • Why Dimensional Models don't work for AI / ML

Why Choose Membership

Access a wealth of collective knowledge

Foster cross-industry perspectives

Adapt your strategies to evolving industry dynamics

Seek guidance and validation for your ideas

Experience the power of collaborative problem-solving as you engage with fellow professionals, guided by seasoned experts in the field. Join me and thousands of others around the world to enrich your experience.

Get Your Membership

Elevate your potential, expand your horizons, and become a driving force in the ever-evolving landscape of data management with our premium Professional Membership.
