Case Study: Data Engineering

Analytics Capabilities Assessment

Transportation


Public transit organizations are increasingly recognizing the importance of being data driven in decision making. A major challenge is the fact that data is often fragmented across disparate data systems. This creates challenges in making operational decisions with the full context of various operational systems. Furthermore, data that is siloed increases operational maintenance of the data and creates complexity when data integration efforts are undertaken.

The Challenge

Our client wanted to assess the state of analytics capabilities within the organization as well as the effort required to implement an analytical data mart (ADM).

The Solution

Analytics Capabilities Assessment

To evaluate the client’s state of analytics capabilities, our data specialists followed Optimus SBR’s 6 Step Business Intelligence & Data Assessment Approach. This assessment process helps define what insights dashboard reports must deliver; where the data comes from; and how to prepare, load, and store the data in a way that optimizes insights in a data visualization tool.

The assessment process conducted by our data team for the client can be summarized as follows:

1.  Stakeholder Alignment – aligns expectations of all key SMEs and stakeholders

2.  Identify Source Data – identifies, scopes, inventories, and prioritizes data sources

3.  Elicit Requirements – captures business requirements and determines analytical capabilities needed to meet stakeholder reporting requirements

4.  Align Data & Requirements – maps source data to requirements and documents missing data and sources

5.  Detail Infrastructure – defines infrastructure required to achieve analytical needs

6.  Present Final Findings & Close-out – provides recommendations and costs for proposed infrastructure acquisition to build ADM

Roadmap & Implementation Plan

After conducting the 6 Step Business Intelligence & Data Assessment, our data team developed a roadmap and an implementation plan for the construction of a 3-Tier ADM (Analytical Data Mart).

The 3-tier ADM infrastructure delivers performance and analytical advantages such as ensuring optimal data accuracy, simplifying data access, and maximizing insightful analysis. The ADM solution also provides the flexibility and scalability to support reporting enhancements.

The tiers of an ADM are comprised of landing, integration, and analytical databases. In addition, ETL (Extract, Transform and Load) scripts run between each of the databases to integrate and prepare the data for output in the top analytical layer.

ADM Internal Infrastructure

Tier 1 – Landing

ETL (Extract, Transform, Load) ingesting raw source data

Database staging area to apply business rules and initial optimizations

 

Tier 2 – Integration

ETL restructuring data to minimize space requirements, align data between source systems, and enforce data quality

Database highly optimized storage of data to remove duplication of data and conserve space

 

Tier 3 – Analytics

ETL restructuring data to optimize data structure for analytical performance

Database highly tuned report access infrastructure to simplify and optimize performance for business users

 

The Results


  • Summary of the Data Estate - The data streams that were identified to be in scope for the assessment were documented and described. Additional documentation regarding the data estate was provided such as the source data inventory document, a source data dependency map and volumetric documentation.
  • Required Data Disciplines to Build an ADM - The required data disciplines to construct an ADM were documented. These included but were not limited to data engineering, data architecture, data analytics, infrastructure, and quality assurance.
  • Rough Estimate of Effort and Schedule - A rough order of magnitude estimate was created based on the considerable amount of detail captured in the assessment process. This document indicates the effort and investment required to create an ADM.
  • Recommendations - The roadmap to construct an ADM identified short-term, mid-term, and long-term recommendations to improve the client’s analytic capabilities.
Assessment defined insights reports must deliver, where the data comes from, and how to prepare, load, and store the data to optimize insights
ADM (analytical data mart) would provide analytical advantages such as optimizing data accuracy, simplifying data access, and maximizing insightful analysis
Effort and investment required to create an ADM (analytical data mart) was determined
Provided a detailed roadmap to achieve improved analytical capabilities that will facilitate data driven decisions