World Wide Data Vault Consortium (WWDVC)

Unlocking Global Insights With Invaluable Data Vault Knowledge

Gain a competitive edge with access to past conference videos through our Professional Membership. Explore expert insights and innovative solutions in data management, staying ahead in this dynamic field. Whether you’re a seasoned pro or new to the game, these videos provide valuable knowledge to boost your career and enhance your organization’s data strategy. Don’t miss out – unlock the power of data today!

About Your Access

A yearly subscription to our conference videos featuring Data Vault thought leaders provides exceptional value, offering access to cutting-edge insights and strategies.  Stay at the forefront of data management and continuously enhance your knowledge and skills. Invest in your professional growth with a Professional Membership today!

Conference Details

World Wide Data Vault Consortium 2021: Our global conference was 100% virtual this year, combining live-streamed and pre-recorded sessions. By the numbers:

  • 32 Presentations
  • 8 Platinum Sponsors
  • 3 Gold Sponsors
  • 4 Silver Sponsors
  • 120 Attendees
  • 35 Sponsor Team Members
Themes included: Data Hubs, Artificial Intelligence (AI / ML / DL), Automation, and Data Vault Case Studies. We had a blast with Data Vault Games! We plan to return in person next year.

Not a Member Yet?

Unlock the power of knowledge and transform your data management expertise with our conference membership. Gain exclusive access to invaluable insights, strategies, and innovations shared by top industry leaders.

Don’t miss out – seize this opportunity to supercharge your professional growth and gain the competitive edge with your conference membership!

Check out these incredible numbers

The amount of knowledge packed into these sessions is phenomenal. Don't miss this opportunity to learn fantastic new ideas.


Conference Sessions

Speakers: Jeff Dodds, Christopher Siegfried

Buy vs. Build - The West Ada School District Data Vault story

West Ada is the largest school district in Idaho (40,000+ students) with the smallest per pupil funding in the country for enrollment levels over 10,000 students. Every dollar is carefully spent to maximize the District's value for increasing student achievement.

The West Ada story of creating a data culture began organically and by accident. A small data team was assembled with the charge to develop student-based analytics to help provide information for increasing student learning achievement. With no budget, it was decided to use existing in-house technology, such as SQL, SSIS, and other Microsoft tools, to begin centralizing the District's data from multiple sources. However, this approach was time consuming and required more resources than were assigned to the project. After some resource adjustments, the District was able to implement a data warehouse; however, the time to maintain and support this development was significant. We were lucky to be associated with other companies in the area that had started using Data Vault, and we became very interested in the opportunities this architecture would provide for making data easier to manage and support, with greater quality.

Our discussion will highlight our journey and the discoveries we made along the way as we developed the West Ada School District's Data Vault.

What you will learn:

  • Lessons learned from implementing a Data Vault
  • The advantages of Data Vault Modeling over other types of Warehouse Architecture
  • How purchasing software may solve only part of an organization's data needs

Speaker: Veronika Durgin

Eat, Pray, Data Vault

A Data Engineer finds herself overwhelmed and stressed out from the never-shrinking backlog of data warehouse new functionality and enhancement tickets. The worry of messing up CDC and never being able to fully reload the company's very complex data warehouse keeps her up at night. As depression over an impossible-to-query data lake sets in, she comes across a blog post written by Kent Graziano about running Data Vault on Snowflake.

Join me on my journey as I share with you how I learned about Data Vault, implemented it in practice, and became its biggest fan.

Agenda:

  • Why choose Data Vault
  • The good, the bad, and the ugly of the methodology
  • Demo with some examples

Speaker: Michael Beach

GALAXY, The Continuous Monitoring Compliance And Reporting Solution For IT Assets.

For large organizations, monitoring compliance and reporting for computer systems can become challenging. As more solutions arise to combat modern IT threats, bridging these systems becomes even more difficult and imperative. GALAXY links data from several asset monitoring systems and provides a platform to monitor these systems and their compliance in near real time.

This presentation will highlight the trials and tribulations of developing the GALAXY system, and the value an asset monitoring system holds.

Agenda:

  • ETL challenges
  • Combining dissimilar data
  • Providing solutions to customers' questions

Speaker: Keith Belanger

Is your Data Vault speaking your language?

This session is based on my blog post: https://www.thepk.info/post/is-your-data-vault-speaking-your-language

When starting your data vault initiative, you do not want to just jump in and start building Hubs, Links & Satellites. You need to take a top-down approach, work with your business SMEs or stewards, define the business ontology, and create a business taxonomy.

No, I am not going to turn this into an in-depth, dragged-out discussion of what an ontology is vs. a taxonomy. Most importantly, you need an enterprise-wide view of your business from a data perspective. Taking the time to do this exercise will give your data vault context in a language the business can relate to and understand. It will also effectively help you identify business keys, relationships, and descriptive and changing data.

Now, you do not need to place your Data Vault implementation on hold until you have done this model for your entire enterprise. Instead, identify your core subject areas or domains and work your way through them in alignment with your business needs and your roadmap of the data sources you will be looking to acquire.

What you will learn:

  • Importance of understanding your business Taxonomy
  • Taking a Top Down and not a Bottom Up Approach to your Data Vault Data Model

Company I work for: EON Collective

On the web at: http://www.eoncollective.com

Speakers: Brett Burnam, Daniel Block

Avoiding failure while delivering value in financial services

The path to delivering valuable data products contains pitfalls both obvious and obscure. Data Vault alleviates many of the common data platform delivery failure points but introduces another set of challenges. The biggest challenge is the mindset change. Data Vault forces us to think differently about data delivery. But from legacy infrastructure to legacy org charts to legacy thinking, financial services (and other large) organizations are change averse.

Join Daniel and Brett as they lay out an approach for people, processes, and technology informed by their successes and failures with Data Vault implementations.

What you will learn:

  • Prioritize people and processes before technology
  • Think, work and deliver horizontally
  • Fail fast, learn from mistakes and deliver business value
  • Achieve buy-in for the Data Vault approach from stakeholders and performers
  • Deliver value through business sponsorship and business focus

Speaker: Doug Needham

Organically Identifying Data Model subject areas through Data Structure Graphs

By using graph theory metrics to identify connected components in a data model, we can identify groups of tables that should be considered together for data warehousing.

Business owners and subject matter experts are invaluable in identifying the subject areas they need to focus on when creating data warehouse components. By using graph theory to analyze our source data models, we can enhance and verify their priorities.

Topics:

  • How to transform an Entity Relationship Diagram to a Data Structure Graph
  • Use a Data Structure Graph to evaluate organic subject areas, Entropy, and other metrics.
  • Apply these metrics to compare data models.
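As a rough illustration of the connected-components idea described in this abstract, here is a minimal sketch. It is not the speaker's method: the table names, the foreign-key list, and the `connected_components` helper are all hypothetical, and the model is treated as a plain undirected graph in which tables are nodes and foreign keys are edges.

```python
from collections import defaultdict, deque

# Hypothetical foreign-key relationships as (child table, parent table) pairs.
FOREIGN_KEYS = [
    ("order_line", "order"),
    ("order", "customer"),
    ("order_line", "product"),
    ("enrollment", "student"),
    ("enrollment", "course"),
]
# Hypothetical table with no relationships; it forms a component of its own.
ISOLATED_TABLES = ["audit_log"]


def connected_components(edges, extra_nodes=()):
    """Group tables into the connected components of an undirected graph."""
    adjacency = defaultdict(set)
    for child, parent in edges:
        adjacency[child].add(parent)
        adjacency[parent].add(child)
    for node in extra_nodes:
        adjacency[node]  # touching the key registers the node with no edges

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every table reachable from `start`.
        queue = deque([start])
        seen.add(start)
        group = []
        while queue:
            table = queue.popleft()
            group.append(table)
            for neighbour in sorted(adjacency[table]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(group))
    return components


components = connected_components(FOREIGN_KEYS, ISOLATED_TABLES)
for group in components:
    print(group)
```

Each printed group is a candidate subject area: a set of tables linked to one another but not to the rest of the model. In practice such groups would still be reviewed with business owners and subject matter experts.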

Speaker: Zoltan Csonka

One year journey with data vault on modern cloud data stack

Some of the most frequent requirements from our customers deal with data silos, building the single version of the truth, reducing re-engineering overhead and building up trusted data warehouses. Oftentimes, we’re explicitly asked to do integration and data modelling.

We have been using Data Vault 2.0, empowering clients to better understand their own organizations, evolve and thrive on data. We help them answer questions such as how to create robust processes that endure changes, how to apply a modern data stack to build a data warehouse faster, how to model data to handle complexity, and how to integrate data coming from many different and rapidly changing sources. You will learn about our experiences during this one-year period.

Topics Covered:

  • How we started with DV2.0 and why
  • Why and where we have used DV2.0
  • Challenges: good and bad experiences based on projects from the education and the e-commerce domains
  • Designing and implementing DV2 on a modern cloud stack
  • Architecture and technology
    such as Fivetran + dbt + Snowflake / BigQuery + Airflow + Lucidchart / SqlDBM / Vertabelo + Python

Speakers: Tracy McDonagh, Keith Evans

The Impact of Delivery with Data Vault & WhereScape

Amica Life Insurance’s small IT Team of three had the unique opportunity to start from scratch and modernize their data platform. After attending WWDVC and exploring WhereScape, Tracy immediately saw the advantages of the two together.

Attend this session and learn how Amica Life has been able to build out new processes with WhereScape and how they continue to rely on WhereScape to solve issues as they arise.

Agenda:

  • Introduction
  • Why WhereScape & Data Vault
  • Success with WhereScape & Data Vault
  • Conclusion

 

Speakers: Iliana Iankoulova, Bas Vlaming, Dinis Louseiro, Tiago Ferreira, Matthieu Caneil

At Picnic, we have been using DV for 5 years. Currently, we have more than 1000 DV structures in our DWH, which are widely used for the Data Science and Reporting layers. We have developed an in-house loading automation framework in Python and are working towards open-sourcing it.

This session will be based on our recent blog posts:

  • https://blog.picnic.nl/data-vault-new-weaponry-in-your-data-science-toolkit-e21278c9c60a
  • https://blog.picnic.nl/picnics-lakeless-data-warehouse-8ec02801d50b
  • https://blog.picnic.nl/how-we-built-our-lakeless-data-warehouse-38178f6cee12
  • https://blog.picnic.nl/the-data-engineers-role-in-the-future-of-groceries-74656881a3d6
  • Presentation at the UK Data Vault user group expert panel session “The things I wish I knew before I started my first Data Vault Project.”

We will cover many topics such as:

  • Business value of DV in supply chain and e-commerce (Iliana)
  • DV for Data Science (Bas)
  • DV in-house automation framework in Python (Dinis)
  • Demo with real data source (Tiago)
  • Open-sourcing (Matthieu)

Speaker: Michael Tantrum

Ensuring data quality through testing is critical to successful data vault implementations. Doing it right and at scale requires automation. In this session we discuss real world examples from multiple customers who are leveraging Validatar to automate the data testing processes of their data vaults, information marts, and cloud migrations.

Agenda:

  • Introduction
  • Scalability
  • Automated Testing Process

Speaker: Jonas De Keuster

VaultSpeed Customer Success: Automate the real world

There are a lot of aspects that can make or break your data warehouse project. We'd like to cover three of those using three cases from the real world: Time to market, cloud architecture and fulfilling business requirements.

Learn how VaultSpeed is speeding up the implementation process at Eurocontrol, the European Organisation for the Safety of Air Navigation, using its out-of-the-box templates. One year into the project, Eurocontrol conducted an internal ROI analysis.

We're launching a huge project at Olympus, a global player in the electrical/electronic manufacturing market. As they are playing on a global level, they are moving their data platform to the cloud. VaultSpeed's cloud architecture fits right in.

Finally, no project succeeds without fulfilling business requirements and speaking the language of the business. At Bank Degroof Petercam, VaultSpeed enabled developers to map their business taxonomy to the data vault model.

Topics Covered:

  • Learn how VaultSpeed can speed up the implementation process using its out-of-the-box templates.
    Templates that:

    • cover almost any combination possible for your implementation
    • are fully tested in all supported environments and every feasible combination
  • Get an idea of the ROI you get from data warehouse automation in general and VaultSpeed in particular.
  • See how VaultSpeed is a perfect fit for your Cloud Data Warehouse.
  • Learn how VaultSpeed enables you to map the business taxonomy to your data vault model.

Speaker: Serge Gershkovich

From Complexity to Control: Utilizing SqlDBM for Improved Data Platform Performance

A special guest joins us for a live interview to discuss how SqlDBM has enhanced the internal processes of their IT team and helped them to deliver improved data models to their customers over the years.

Topics Covered:

  • Introduction
  • Why use SqlDBM and what has it done for you?
  • Examples of improvements
  • Conclusion

Speakers: Dr. Alexander Brunner, Dr. Nicolas Fritz

Dillinger had built an "on-demand data warehouse" for its production data, with many independent tables, using its own ETL applications in Java over two decades. With a very small team, maintaining the warehouse took up most of the available time. After a careful evaluation of options, Dillinger partnered with Scalefree to construct a new DV2.0-based data warehouse from scratch. Using the WhereScape ETL toolchain allowed us to start the migration process with only a small number of people and continually expand it while keeping the old data warehouse alive. In this case study, we describe the construction of a technical data warehouse based on Data Vault 2.0 and WhereScape ETL automation, the challenges, how we addressed them, and the feasibility of our approach.

Topics Covered:

  • All about the case study of a DWH project in the heavy plate industry
  • How to build and operate a DV2.0 DWH with a small team
  • How to overcome challenges and avoid pitfalls
  • How to get a DWH up and running in a successful combination of an internal team with external experts

Speaker: Fabio De Salles

Having the whole enterprise run on data is modern management's Holy Grail. Nowadays, for any business worth its name, "data" is the sexiest, hypiest, slangiest word, with terms like Data Science, Big Data, Machine Learning and the like crowding data’s word cloud (no pun intended!).

What does it take for a culture to grow into using data as its daily input? A simple answer may not yet be available, but the complementary question, namely what can prevent a data culture from forming, can be assessed.

This talk will show why Data Vault is the single most powerful enabler of an enterprise-wide data culture and how choosing to not have a Data Vault-based environment works to undermine any data culture initiative.

Agenda:

  • What is a Data Culture
  • What can prevent a Data Culture from forming
  • Why Data Vault can enable a Data Culture
  • How Data Vault can be applied to allow for the blooming use of data for business success

Speaker: Vincent McBurney

How to choose the right Data Vault 2.0 Automation approach for your EDW

Looking at the current state and opportunities of Data Vault Automation software based on product evaluations and project experiences. This presentation looks at the critical success factors for Data Vault and Business Vault automation. It explores the acceleration opportunities and ongoing maintenance benefits of each automation approach. It summarises the current state of Data Vault 2.0 automation tools.

Data Vault is a pattern-based Data Warehouse modelling approach that works well with metadata driven design. A Data Vault is looking for a load approach that can receive Data Vault design decisions such as choosing business keys and splitting satellites and then generate the tables and loads.

A Business Vault and Information Marts require data integration capabilities to handle complex bespoke business rules and cleansing rules which can store results in Data Vault 2.0 objects.

What you will learn:

The session describes all the patterns of Data Vault 2.0 design and build, and presents the pros and cons of each approach to achieving Data Vault automation:

  1. DIY: build your own automation framework using coding and database components.
  2. Integration Platform build: a Data Integration platform that can be used to build Data Vault patterns such as Informatica IICS, IBM Information Server or Talend.
  3. Data Vault full automation: a system that is designed from the ground up to automate Data Vaults such as VaultSpeed, DataVault Builder, ErWin or WhereScape.
  4. Look at the time and cost and the critical success factors of each approach to work out which one fits your project skills and budget. Put Data Vault automation into an estimation model covering acquisition, initial build and ongoing use and maintenance.

Examine the ways a team can deliver Data Vault and Business Vault under an agile methodology and how data vault automation supports this.

Speaker: Dan Linstedt

I will be walking you through an introduction to Data Vault.  We will cover topics that include: Architecture, Methodology, Modeling, and Implementation.  The focus of this presentation is to offer anyone with an interest, information about Data Vault itself.

Some of the topics I will cover include:

  • What is Data Vault in a nutshell?
  • Why is it important?
  • How does it help business?
  • What is its role in the Data Lake / Data Hub?
  • How does Data Vault work with Big Data?

And a few other topics.  Anyone is welcome to come and check out this early session.

Speaker: Heli Helskyaho

Machine learning is here and it will stay. Everybody should know what it is about and where it could be used. So what is machine learning and where can it be used?

What is unsupervised learning, supervised learning, or reinforcement learning? What is deep learning? In this presentation we will talk about clustering, regression, classification, and so much more. We will talk about measuring, improving the model, feature selection, feature transformation, principal component analysis, hyper-parameter tuning and much more.

At the end of the session we discuss what else there is to learn and how to get started with machine learning.

In this presentation you will learn what machine learning is all about and hopefully get so excited about the whole thing that you want to learn more! After this presentation it will be easier to start learning. This presentation is an improved version of a presentation awarded best presentation at the KScope19 conference on the Emerging Technologies track.

Please note: Heli is a Certified Data Vault 2.0 Practitioner, and has been working with Data Vault for over 5 years.  This presentation is NOT to be missed!  She will tie machine learning to Data Vault across both of her presentations.

Speakers: Keith Ellis, Jas Phul, Dan Linstedt

IIBA, the International Institute of Business Analysis, is a non-traditional presenter for DataVaultAlliance. Dan Linstedt will kick off the session with an introduction to IIBA, why he sees business analysis as essential, and where failures are happening in implementation. Keith Ellis, IIBA’s Chief Engagement and Growth Officer, and Jas Phul, Director of Product Development, will talk about how to better connect projects to the business to improve business outcomes.

The IIBA team will focus on the profession of business analysis and what to look for in competent analysts, core methods in elicitation and management as well as standards in business analysis. Finally, the team is going to look at using business analysis to improve outcomes specifically in data initiatives and as a foundation to data warehouse implementations.

Agenda:

  • How is the business analysis profession changing?
  • Business analysis competencies - what are you looking for in your analysis capability?
  • How do you improve business outcomes in data projects?

Speaker: Heli Helskyaho

The biggest problem with machine learning is data. There is not enough data, the data is not correct, etc. Data Vault is an excellent source for machine learning since it has good quality data that has been modeled, understood, and checked with the Data Vault methodology processes.

How do you do machine learning in a Data Vault database? What tools are available, and how is it best to get started? In this presentation we will talk about these questions and see some examples of machine learning in real life.

Topics Covered:

  • why Data Vault is a great source for Machine Learning
  • how to use your Data Vault data for Machine Learning

Speaker: Scott Ambler

We live in a VUCA (volatile, uncertain, complex, and ambiguous) world. If COVID-19 has taught us anything, it is the need for our organizations to be resilient and flexible in the face of change. We've seen many organizations struggle, and we’ve seen many organizations thrive. What are the successful organizations doing that you’re not, and what do you need to do to become more effective?

This session explores how you can learn to work smarter at the individual, team, and organizational levels based on the situation that you face, even as that situation evolves around you. The Disciplined Agile (DA) tool kit provides contextualized and easy-to-navigate guidance for evolving your way of working (WoW) to improve your overall effectiveness. DA describes a robust, pragmatic, and comprehensive approach to business agility, teaching you how to respond swiftly and effectively to changing conditions. The focus of this session will be on data-oriented aspects of an organization, and how they can evolve with the times.

  • Learn how to enable teams to work smarter via a fit-for-purpose WoW
  • Discover how to lead and govern teams that work in disparate ways
  • Understand that agile frameworks are only a start; your real goal is to become a learning organization

https://www.pmi.org/

 

Speaker: Bruce McCartney

The Business Vault is increasingly a critical component of the Data Vault 2.0 architecture. The purpose of this presentation is to discuss some of the Artificial Intelligence advances in automating 'soft' business rules.

The audience will learn various possible implementation patterns for automating business rules using Artificial Intelligence. This includes an overview of rules engines, machine learning, deep learning and causal inference taxonomies, and techniques to augment the raw data into actionable insights. An additional introduction to some modelling terminology and techniques will give the audience a high-level understanding of the use of data science techniques to automate business rule development, as well as some alternatives for auditing and debugging AI pipelines.

The presentation will outline various implementation patterns for using Artificial Intelligence in your information pipelines, including exposure to software tools for development and management of these new paradigms. Finally, it offers a practical discussion of AI for augmenting data catalogs (identifying business keys) and of how AI is used to forecast weather.

Topics Covered:

  • Business Vault Overview
  • Rules Engine Approach to Business Rules
  • Machine and deep learning Overview
  • Causal Inference

Speaker: Lynn Noel

Digital business requires transformation. Digital transformation depends on data. Data requires good management. Data management depends on a warehouse to store data and analytics to make use of it. Data Vault is the best choice for scalable EDW and BI, so it’s commonly found at the core of transformations for digital business. 

WWDVC is an opportunity to step back out of the weeds and zoom to the Big Picture of what we do. This presentation puts DV in context at a high level by layering two frameworks to generate insights: the industry-standard Data Management Body of Knowledge (DAMA-DMBOK) and the futurist Machine, Platform, Crowd framework, which explores machine intelligence, big data and the sharing economy.

Data managers must innovate faster than ever to organize and govern human and AI networks of people and things that produce bigger and less structured data. What do we need to know to deliver data management differently at digital speed and scale?

  • Who are the new executive sponsors for data management?
  • What does a governance roadmap need to do in the first 3-6 months?
  • Why and how must data governance guide technical platform governance?
  • Where do you start when your data environment seems too big to know?
  • How does advanced analytics drive and disrupt data management strategy?

Topics Covered:

  • A Matter of Scale: Where Does Data Vault Fit With Digital Business?
  • DV, DMBOK, & Machine, Platform, Crowd: Layering Frameworks to Generate Insights
  • Digital Business as a Megatrend: Why Are We Building Data Vaults?
  • Machine, Platform, Crowd: Core Concepts in Digital Business
  • DMBoK 2.0 and Beyond: Digital, Disruption, & Transformation in the DMBoK
  • From Disruption to Transformation: Data Strategy Innovations
  • Call to Action: A Manifesto for Digital Business

Speaker: Richard Strange

Can Data Vault help with data-driven academic research? Academic research is following the trends of the private sector, with the dual problems of growing volume and complexity of data. As reproducibility of data-driven research comes under greater scrutiny, the need to share data sets is more important than ever, but how do we bridge the gap between current practice and the necessary capabilities in research?

Topics Covered:

  • Current state and challenges of data-driven research
  • Case studies
  • Model approaches for applying Data Vault methods

Speaker: Dan Linstedt

We will be announcing our brand-new early adopter program:

We will be introducing our brand-new Vendor Tool Certification Program and discussing the ins and outs of getting a tool certified. We welcome all participants and all vendors to this session. Come hear what is necessary to get a tool certified. We are happy to launch this initiative and share what our upcoming requirements will be.

The three areas of engagement are:

  • Consulting
    Work for hire under a simple statement of work. You can hire DVA or one of our resources to meet with your team for a white-board or working session. The result is verbal feedback on enhancements and suggestions for improving your tool OR your consulting company. Remember, we can help your consulting company position, sell, and market Data Vault properly as well. There are restrictions around leveraging this engagement type, which will be discussed during this presentation.
  • Tool Certification
    This is the alpha program strictly for tool / software vendors. We will describe this, define this, and provide a hand-out that details what's in this area of engagement and how it works. Any vendor in the market world-wide may apply to participate, and yes, there will be yearly fees and renewals to keep the certification status active.
  • Joint Marketing, and Lead Sharing
    This program will encompass multiple levels of marketing statements, as well as allow you to interact directly with community members. The benefit here is that community members are already using data vault, and because it is a paid community, you would receive warm leads. If the API or lead-sharing components are selected, we can extend our platform to send you leads (IF the community members opt in), and even connect directly to your API or your CRM or both through automated B2B integration.

Speaker: Fabio de Salles

"It is possible to commit no mistakes and still lose."

In the beginning, there was no other way to build a traditional Data Warehouse than waterfall-like project management. As time passed and projects accumulated dysfunctions and waste, it became clear something better was needed. Hence came the Agile Manifesto, rewriting the playbook on what success was and how to achieve it. But old projects die hard. The more apparently successful a project is, the more it resists change and stays on duty unchallenged. This talk will present one such case: the curious impact of applying Data Vault to running legacy DW+Waterfall projects.

The project had been running for a couple of years, using some of the most expensive and advanced technologies and processes, such as Oracle, Dimensional Modeling and a host of other such items. Initially highly successful, the project delivered a priori reporting to customers of a niche ERP. As time passed, the increased complexity carried by Dimensional Modeling brought progress to a screeching halt: ETL time surged to half a day (if it completed at all), reports waited in line for a developer to tackle them, and data errors spiked. Customers were starting to find ways around it, and costs took the profits out for a walk, a long, seemingly unending walk.

Finally deciding to rein in the mess, the company's CEO contracted a Data Vault consultancy to raise the project to profitability again. And something totally unexpected happened: the project reached so great a success that it died out!

How can DV make a heavy and busy Data Warehouse project so successful that it dissolves into thin air? Attend the presentation to learn the full story of how you can make no mistakes and still fail.

(Stay through to the end, for this is a story with a happy ending, with lots of profits reaped after all!)

What you will learn:

  • How Data Vault can rescue problematic projects;
  • What happens to DW projects when productivity rises continually;
  • The value of teams.

Speaker: Nols Ebersohn

Have you ever considered the Laws of Nature in our physical environment? You know, little things like "what goes up must come down", otherwise known as the law of gravity. There are the laws of thermodynamics and many others. The interesting phenomenon of laws of nature is that they hold true in both the positive and the negative. For instance: what goes up must come down. It also survives the negative: what goes up cannot stay up indefinitely.

So what are these laws of nature for information? What are the consequences of these laws of nature on our design? How does it change the way we approach information management and in particular, how and when do these laws affect your Data Vault?

These are perplexing questions! Given the universality of the laws of nature, if you violate these laws, it tends to result in unwanted or unexpected disasters. This discussion will explore these laws and the impacts on design and the relevance to your data vault deployments.

What you will learn:

  • What are the laws of nature of Information?
  • What are the effects of these laws on information design?
  • What are the effects at a Model level?
  • What are the effects at an architectural level?
  • What are the consequences of violating the laws of nature pertaining to information?

Speakers: Bill Inmon (WH Inmon), Francesco Puppini

The Unified Star Schema is a new approach for Information Marts. The data model is a single star schema, and it is centered on a table called the “Puppini Bridge”. This central table does not depend on the business requirements. The methodology to build the Puppini Bridge is repeatable, pattern-based, and ideal for automation.

The Unified Star Schema does not affect existing Data Vault 2.0 constructs; it is a new way of looking at classic dimensional modeling on the way out of the DW.

It handles not only the classic Fact-Dimension relationship, but also multi-fact queries and non-conformed granularities. It is ideal for self-service because it is extremely easy to use, even for business users.

We will start by presenting a brief history of the Data Warehouse. Then we will illustrate a problem that exists today in traditional Dimensional Modeling: redundancy. We will show how this approach reduces redundancy to zero.

This solution has been tested in several client implementations, but not yet on massive data volumes. The open question is: will it scale?

Topics Covered:

  • Why redundancy happens
  • How redundancy can be avoided
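
As a rough illustration of the pattern-based construction described above, the sketch below builds a small bridge table by stacking every source table into one structure, with one "stage" per table and one key column per table. The table names, sample rows, the `model` dictionary, and the `Stage` and `_KEY_` column names are assumptions made for this example, not the speakers' implementation.

```python
# Hypothetical source tables; names and values are illustrative only.
customers = [
    {"customer_id": "C1", "name": "Acme"},
    {"customer_id": "C2", "name": "Globex"},
]
orders = [
    {"order_id": "O1", "customer_id": "C1", "amount": 100},
    {"order_id": "O2", "customer_id": "C1", "amount": 50},
]

# Table name -> (primary key column, {referenced table: foreign key column}).
# Only keys and relationships are described; no business requirement is needed.
model = {
    "Customers": ("customer_id", {}),
    "Orders": ("order_id", {"Customers": "customer_id"}),
}
tables = {"Customers": customers, "Orders": orders}


def build_bridge(tables, model):
    """Stack all tables into one bridge: one stage per table, one key column per table."""
    key_columns = [f"_KEY_{name}" for name in model]
    bridge = []
    for name, rows in tables.items():
        pk, fks = model[name]
        for row in rows:
            # Start with every key column empty, then fill what this row knows.
            entry = {"Stage": name, **{col: None for col in key_columns}}
            entry[f"_KEY_{name}"] = row[pk]
            for target, fk in fks.items():
                entry[f"_KEY_{target}"] = row[fk]
            bridge.append(entry)
    return bridge


bridge = build_bridge(tables, model)
for entry in bridge:
    print(entry)
```

Because the function reads only the key metadata, the same loop applies unchanged when further tables are added to `model`, which is what makes this kind of construction a candidate for automation.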
Speaker: Hannu Jarvi

Machine Learning / Artificial Intelligence is largely about recognizing patterns in data.
You have probably seen TV news reports about crime and noticed how the face of the accused perpetrator or the victim has been blurred or pixelated.

The resolution has been lowered to hide the patterns that make that person recognizable. Something very similar happens when the granularity of data changes in traditional Data Warehousing. The data is “pixelated”. When Machine Learning can no longer recognize patterns because the data has been “pixelated”, that information is lost forever.

Data Vault is the first Data Warehousing paradigm that seeks to preserve all information, thus it is also the first paradigm that provides a solid foundation for ML/AI.

The Machine Learning pipeline itself creates a lot of new information. In addition to the end result (for example, an evaluation of an X-ray), it provides a lot of intermediate information: alternative explanations with varying probabilities, different results with different parameter values, and so on. This information may eventually prove valuable when analyzed against actual outcomes.

The volume and complexity of this kind of information may grow exponentially. A Data Lake, the data management workhorse of the ML/AI community, cannot deal with this complexity. The information is “Lost in the Exhaust” of the ML pipeline.

This presentation explores ideas for capturing ALL value created in ML/AI pipelines. The ideas are based on challenges faced by some leading ML scientists in Healthcare and Pharma, and the presentation shows the opportunities Data Vault 2.0 provides for solving these challenges.

Topics Covered:

  • Why AI / ML needs pristine data in its original grain
  • Why DV is an ideal foundation for AI / ML
  • Why Dimensional Models don't work for AI / ML
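
The “pixelation” effect described above can be shown with a toy example. In this minimal sketch, two series with very different shapes collapse to the same total once they are rolled up, so a pattern that is detectable at the original grain disappears. The patient names, the readings, and the spike rule are made up for illustration.

```python
# Hypothetical daily readings for two patients (illustrative values only).
daily = {
    "patient_a": [10, 10, 10, 10, 10, 10],  # steady
    "patient_b": [0, 0, 0, 0, 0, 60],       # one sharp spike
}


def aggregate(readings):
    """Roll the original grain up to a single total, as a summary table would."""
    return sum(readings)


def has_spike(readings, factor=3):
    """Detect a simple pattern: any reading above `factor` times the mean."""
    mean = sum(readings) / len(readings)
    return any(r > factor * mean for r in readings)


# At the aggregated grain the two patients are indistinguishable.
totals = {name: aggregate(r) for name, r in daily.items()}

# At the original grain the spike is still visible.
spikes = {name: has_spike(r) for name, r in daily.items()}

print(totals)
print(spikes)
```

Once only `totals` is stored, no later analysis can recover which patient had the spike, which is the argument for keeping data at its original grain.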
Speaker: Stephen C Moon

Learn how to leverage Data Vault to enable your DataOps pipeline to manufacture and distribute data and analytics products.

Enterprise Architecture
- Vision, Current State, Future State, & Roadmap Development
- Value Proposition Design
- Business and Operations Modeling
- Business Process Decomposition / Modeling
- Cost Modeling

Data Operations (DataOps)
- Acquisition, Processing (Curation and Enrichment), & Distribution
- Organizational Consulting (People, Process, & Technology)
- Data Architecture & Engineering
- Data Vault
- Data Modeling
- Data Warehouses and Data Lakes
- Batch; Stream; Extract, Transform, & Load (ETL)
- SQL and Spark
- Business Intelligence (BI), Analytics, and Machine Learning (ML) Enablement
- Security and Compliance

What you will learn:

  • Business Intelligence (BI), Analytics, and Machine Learning (ML) Enablement
  • Business and Operations Modeling
  • Value Proposition Design
Speaker: Gabor Gollnhofer

This is a case study on a DW transition from “traditional” (a hodgepodge of 3NF with history, dimensional and flat-wide tables) to Data Vault based. This Data Warehouse (in its first “incarnation”) was built almost 20 years ago. Since then, its major parts have been reorganized three times. Because of changes in the internal & external environment, we had to change our ETL tool (again). Finally, we decided to do a complete reorganization, including our ETL & modelling tools and development processes, and to move to Data Vault based modelling.

Of course, we had to do this in a short period with limited internal & external staff. And the old and new DW had to work together so that the major processes could be migrated one by one.

Topics Covered:

  • Why we’ve chosen to use Data Vault
  • What were the pain points, pitfalls and solutions
  • What are the results & what we’ve learnt during this transition

Beware! This is not the usual "bright & shiny" presentation; I'll share some "worst practices" on the development & operations of the DW. And if the audience wants to participate, we could also exchange our "horror stories" during the Q&A ;-)

Speaker: Michael Lutz

Fox Chase has embarked on a journey to re-engineer our Data Warehouse internals with the Data Vault 2.0 Methodology. DV 2.0 strongly emphasizes metadata-driven automation, and we are attempting to follow the methodology in its entirety. The technologies employed by our solution include Python, Jinja templates, Airflow, and Oracle. The Jinja templates defining the nature of the output can be re-expressed to support other target database technologies.

We will examine all aspects of the solution with the goal of defining a roadmap for others to implement DV 2.0 with this open source toolset. Additional discussion points include Data Vault design decisions in the Healthcare vertical, the motivation for Fox Chase deciding to take these steps, and an engineer’s perspective on tackling the learning curve of these technologies.

What you will learn:

  • How to implement DV 2.0 with open source technologies: Python, Jinja, and Airflow
  • Walk through the experience of making DV 2.0 design decisions in the Healthcare vertical
  • Motivations for migrating to the DV 2.0 Methodology
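
To give a feel for the metadata-driven, template-based generation described above, here is a minimal sketch that renders a hub load statement from a metadata dictionary. It uses Python's standard-library `string.Template` as a stand-in for Jinja so that it runs without extra packages. The table and column names, the metadata layout, and the SQL pattern are assumptions for this example and are not taken from the Fox Chase solution.

```python
from string import Template

# Hypothetical metadata describing one hub; all names are illustrative.
hub_metadata = {
    "hub_table": "hub_patient",
    "hash_key": "patient_hk",
    "business_key": "patient_id",
    "staging_table": "stg_patient",
}

# A load pattern for new business keys, written once and reused for every hub.
# Swapping this template is how a different target database could be supported.
HUB_LOAD_TEMPLATE = Template("""\
INSERT INTO $hub_table ($hash_key, $business_key, load_date, record_source)
SELECT DISTINCT s.$hash_key, s.$business_key, s.load_date, s.record_source
FROM $staging_table s
WHERE NOT EXISTS (
    SELECT 1 FROM $hub_table h WHERE h.$hash_key = s.$hash_key
)""")


def render_hub_load(metadata):
    """Render the hub load statement for one metadata entry."""
    return HUB_LOAD_TEMPLATE.substitute(metadata)


sql = render_hub_load(hub_metadata)
print(sql)
```

In a fuller setup, one such template per Data Vault object type (hub, link, satellite) would be rendered for every metadata entry, and a scheduler such as Airflow would run the resulting statements.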
Speaker: Dan Linstedt

It's that time again: time to wrap up the WWDVC 2020 conference and bring it to a close. I will be talking about the things we saw, the things we learned, and potentially next year's conference as well. I look forward to seeing all of you who can make it to the closing remarks.

Thank you, and see you next year!

Get Started Today

Connect with other professionals and unlock a world of unparalleled opportunities. Elevate your expertise and access tons of self-paced e-learning content. Join your peers in our dynamic community and enhance your skill sets today!