Positioning WhereScape as a trendsetter in the data warehousing industry

The data warehousing space is changing, and WhereScape, a well-established player in the field, wanted to bring a ground-breaking product to market. This was a complex, greenfield, highly technical project where I had sole charge of the UX/UI of the new product. The result is a product that generated sales and helped WhereScape win several awards.

Screenshot of the Data Vault wizard
Client
WhereScape
Role
UX/UI design
Interaction design
Front-end development (UI)
Copywriting
Date
2016-2018

Background

WhereScape is a well-established name in the data warehousing space. Their software and solutions help organisations around the world centralise their data and access it faster. WhereScape's data warehousing software takes data from multiple sources and consolidates it into a single repository that can then be queried to generate reports for business intelligence.

Challenge

Traditional data warehouses are updated daily (often overnight), but the emergence of real-time sensor data, IoT devices and cloud-based solutions requires almost instantaneous updates to provide up-to-the-minute business insights. WhereScape wanted to develop a modern solution capable of handling real-time data streaming while still offering the automation capabilities the company was renowned for, allowing data to be collected, manipulated and stored, across multiple sources, in a matter of seconds.

The traditional approach

Businesses use data warehousing to consolidate information from different systems into a single source. This gives businesses more insight into their operations and lets them analyse data over several years to detect significant trends that can then be translated into actionable business plans.

Real-life examples

Consider a retailer with multiple outlets across the country. By analysing sales, stock and staff performance data, the retailer can see which stores are the best performers and which products sell best at specific locations.

The future of data warehousing

With the emergence of real-time sensor data, IoT devices and cloud-based solutions, the traditional approach of updating a data warehouse overnight is no longer sufficient. Businesses adopting these technologies require almost instantaneous updates to provide up-to-the-minute business insights. This is particularly critical for systems that rely on real-time data to identify anomalies.

Real-life examples

A company that performs real-time checks on credit card transactions to identify fraud, or a truck manufacturer that can identify a critical error and stop the driver before an accident happens.

Process

This was a greenfield, very technical project being developed for an emerging market. Working on it required learning more about databases, how they work and how they're structured, and the rules of specific data warehousing methodologies. It also involved being keenly aware of implementation details and constraints. Getting to know the field meant learning more about the company and the business logic surrounding the problems. I wasn't a domain expert, but this turned out to be useful: it brought a fresh perspective.

One of the many whiteboard sessions with the team; this one in particular was about database foreign keys and how they work.
Photo of a whiteboard session

Early days

I was heavily involved in the early stages of scoping and requirement gathering, and worked closely with the product owner and project manager while defining the business requirements. The team (including developers) would often get together to sketch out our understanding of certain parts of the project and to make sure we were all on the same page.

Whiteboarding and sketching sessions with the team were a staple during the whole process. They helped inform the quick prototypes I created in order to test assumptions with team members, to define the information architecture and to visualise how the product could look.

Understanding the users

Working on a product ahead of its time, and early to the market, proved challenging in terms of user research. Fortunately, WhereScape had an in-house consulting team, so I was able to conduct guerrilla research (i.e. hijacked lunches) and interview current users of WhereScape products to understand their traditional workflow and the pain points in existing products.

Application flow

The discovery stage was an intense process in which the team and I discussed our ideas about the application and cemented a shared terminology. I spent a fair bit of time getting familiar with the problem and the domain, and thinking about how we could transform what were, at that stage, abstract ideas into something concrete.

Making sense of it all. An early user flow on the left. Although the application grew and became more complex with time, the fundamental flow stayed the same. On the right, an early attempt at defining the information architecture of the application, and a summary of different areas of the application and actions that could be performed.

User flow
Different areas of the application and the actions that can be performed in them

Information architecture

As we progressed in the discovery stage, connections (the sources and destinations of data) and dataflows (containers of data) emerged as the main drivers of the application.

Connections

The sources of data and the destination endpoints for data transformed by users.

Dataflows

The containers of data inside the application. Data (tables, scripts) inside dataflows can be manipulated and transformed.
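
To make these two concepts concrete, here's a minimal sketch of how they could be modelled. The shapes and field names are my own illustration, not the product's actual data model.

```typescript
// Illustrative sketch only: field names are invented, not the product's
// actual data model.

// A connection is a source or destination endpoint for data.
interface Connection {
  id: string;
  name: string;
  kind: "source" | "destination";
  endpoint: string; // e.g. a database connection string or stream URL
}

// A dataflow is a container for the data being modelled. The items inside
// it (tables, scripts) can be manipulated and transformed.
interface Dataflow {
  id: string;
  name: string;
  items: Array<{ type: "table" | "script"; name: string }>;
  sources: Connection[];
  destinations: Connection[];
}
```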

The first wireframes and low-fidelity prototypes reflected this understanding. They were also my first attempt at visually representing what we were building, and were used to test the proposed flows with stakeholders and team members.

Although the earlier research identified connections and dataflows as the two main drivers of the application, reducing it to only those two components wasn't feasible. I separated the application into six main logical areas, based on the actions users had to perform in each of them. This approach allowed us to show or hide these logical areas based on a user's permissions (a short sketch of this gating follows the list below) and proved solid when we had to add an extra area that wasn't part of the initial scope.

Setup

User and platform management area.

Connect

Configure sources and destinations of data.

Discover

Browse and import data contained in connections.

Design

Design the structure and flow of imported data, modify it as needed and set its destination.

Deploy

Execute the abstract models created in the Design area to produce concrete data.

Monitor

Monitor the status of deployed models and their data.

Documentation

Automatically generated metadata about all items inside the project, for auditing purposes.
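
The permission-based gating mentioned above can be illustrated with a small sketch. The area names match the list; the permission model itself is hypothetical, for illustration only.

```typescript
// Hypothetical sketch of permission-based visibility for the logical areas.
type Area =
  | "setup"
  | "connect"
  | "discover"
  | "design"
  | "deploy"
  | "monitor"
  | "documentation";

interface User {
  name: string;
  permissions: Set<Area>;
}

// Only the areas a user has permission for are shown in the navigation.
function visibleAreas(user: User, allAreas: Area[]): Area[] {
  return allAreas.filter((area) => user.permissions.has(area));
}
```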

Building up the user interface

The new product was built on web technologies, which gave us the flexibility to move away from traditional desktop software interfaces and deliver a modern-looking product. This opportunity also had its challenges: I had to investigate how to translate existing interaction paradigms from desktop apps to the web, but often a direct translation wasn't the right choice (e.g. double-clicking to select items).

Fortunately, web-based technologies offered us the flexibility to come up with our own interaction paradigms. We were able to build different views of the application for different use cases (some form-heavy, others more visual) and blend them together, while keeping a consistent look and feel.
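
Taking the item-selection example: instead of desktop-style double-click selection, a web-friendly approach is single-click selection with a modifier key for multi-select. A minimal sketch of that idea follows; it's illustrative, not the product's actual code.

```typescript
// Sketch: single-click selection with Ctrl/Cmd for multi-select, replacing
// the desktop convention of double-clicking to select items.
function handleItemClick(
  selected: Set<string>,
  itemId: string,
  event: MouseEvent
): Set<string> {
  const next = new Set(selected);
  if (event.ctrlKey || event.metaKey) {
    // Modifier click toggles the item in the current selection.
    if (next.has(itemId)) {
      next.delete(itemId);
    } else {
      next.add(itemId);
    }
  } else {
    // A plain click replaces the selection with just this item.
    next.clear();
    next.add(itemId);
  }
  return next;
}
```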

Development underway

With the project moving fast and the development team one sprint behind, the interactive prototype developed in Axure quickly morphed into the UI design reference for the developers. The final polish was added directly by tweaking the styles and markup of the web application: I often committed code to version control to ensure a consistent look and feel, and to make changes to the application's copy.

We originally used Twitter Bootstrap for the application styles, but as we progressed it became more of a hindrance. We ended up creating custom styles instead; although this meant more work upfront, it provided us with much-needed flexibility. Below are some of the styles I developed for form elements.
Screenshot of code styles

Documenting design decisions

The prototypes (developed in Axure) quickly evolved and became more complex. They were soon being used to identify technical constraints as early as possible and avoid costly changes during development. The prototypes were also used to demo functionality to future users and stakeholders.

While the prototype was good for visualising interactions, the proposed functionality and interactions weren't always clear, so I started documenting them in Confluence, the project wiki. Although this was time-consuming, it became invaluable for keeping team members on the same page.

User interface & experience challenges and solutions

Viewing the data and modelling its flow

Data warehousing involves manipulating data (tables and records) through the use of transformations (scripts). Data may have to go through several transformations until it fits a format adequate for generating business intelligence reports.
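
The idea of chained transformations can be sketched in a few lines. This is a conceptual illustration, not how the product implements it:

```typescript
// Conceptual sketch: each transformation takes rows and produces new rows,
// and the output of one step feeds the next until the data fits the
// desired reporting format.
type Row = Record<string, unknown>;
type Transformation = (rows: Row[]) => Row[];

function runPipeline(rows: Row[], steps: Transformation[]): Row[] {
  return steps.reduce((data, step) => step(data), rows);
}

// e.g. runPipeline(rawRows, [normaliseNames, aggregateByDay, formatForReport]);
```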

One of the biggest UX challenges was how to let users inspect the data and the transformations it went through along the way, as well as the relationships between and within the data.

Traditional applications in this space present the data in a list view, and each time a user drills down one level (for example, to view the column definitions inside a table), context is lost. Additionally, the transformations the data has gone through are often hidden.

The UI of WhereScape RED, one of the established tools in the traditional data warehousing space. It follows the convention of panels at the bottom and sides, with a main area in the centre. Actions live in bottom bars, context menus or the main application menu.
Screenshot of RED

During whiteboard sessions the team always ended up drawing the flows of data as diagrams; it soon became obvious that this was the natural way of presenting and manipulating data flows. By contrast, we always defaulted to a list view when discussing the properties of the data.

Our solution was a hybrid approach: we displayed the transformations the data had gone through in a diagram view, and provided extra information about the structure of the data in a side panel. The detailed view lets users drill down several levels while a breadcrumb trail at the top of the panel keeps track of where they are.

Large diagrams soon became unreadable and were, from a technical point of view, a performance concern. Solving this challenge required researching successful node-based interfaces and analysing how their features overcame the hurdles we encountered.

The solution included encouraging users to break their data structures into smaller, more manageable chunks as soon as they start modelling the data. The diagram itself has features informed by the earlier research, such as search, filtering functionality and a mini-map.
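
The filtering idea itself is simple to sketch. The node kinds and names below are illustrative; the recording that follows shows the real interaction.

```typescript
// Illustrative sketch of diagram filtering: hide script and transformation
// nodes so that only the data containers remain visible.
type NodeKind = "table" | "script" | "transformation";

interface DiagramNode {
  id: string;
  kind: NodeKind;
  label: string;
}

function filterDiagram(
  nodes: DiagramNode[],
  hidden: Set<NodeKind>
): DiagramNode[] {
  return nodes.filter((node) => !hidden.has(node.kind));
}

// The interaction in the recording below amounts to:
// filterDiagram(nodes, new Set(["script", "transformation"]));
```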

Screen recording of a user interacting with the diagram. The filtering functionality is used to remove scripts and transformations, leaving only the data containers.
Recording of a user interaction with the diagram

Understanding the technical implications of design decisions

We researched several out-of-the-box software libraries for the diagram functionality, but none provided what we wanted: the ability to expand and collapse nodes and display their children in a table-like format. Fortunately, the development team was on board with my vision for the diagram and customised the rendering functionality of an existing library to look and behave the way we wanted.

Going with a custom renderer meant creating the templates for the nodes in Illustrator, carefully naming each of the component layers and exporting them to SVG. I worked closely with the developers to understand what technical constraints we had and documented the nodes' structure in Confluence, to ensure that further changes wouldn't break the diagram.
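
Conceptually, the renderer looks up the named layers in the SVG template and fills them with the node's data. The sketch below uses invented layer names; the real templates and naming scheme were documented in Confluence.

```typescript
// Hypothetical sketch: fill the named layers of an SVG node template with
// the node's data. Layer IDs like "node-title" are invented for illustration.
interface NodeData {
  title: string;
  columns: string[];
}

function renderNode(template: SVGElement, data: NodeData): void {
  // The Illustrator layer names survive the SVG export as element IDs.
  const titleEl = template.querySelector("#node-title");
  if (titleEl) {
    titleEl.textContent = data.title;
  }
  data.columns.forEach((column, i) => {
    const rowEl = template.querySelector(`#node-row-${i}`);
    if (rowEl) {
      rowEl.textContent = column;
    }
  });
}
```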

Screenshot of some of the nodes and their structure in Illustrator.
Screenshot of the nodes structure in Illustrator

Automating the data warehousing process

Data warehousing methodologies can be hard to understand and are complex to implement from scratch. To make matters worse, there are several competing methodologies and each follows different implementation patterns.

One of the data warehousing methodologies, called Data Vault, is synthesised in a 700-page book.

Data Vault book cover

We removed this high barrier to entry by allowing users to grab data from different sources and use it as the starting point for their model of the data stream. The model can then be constructed by following guided steps that abstract away the complexity of data warehousing methodologies, letting users focus more on their model and less on the technical details of the implementation.

You can read the book, or you can use WhereScape Automation with Streaming to create a Data Vault by following six clear and succinct steps. Below is a screenshot of one of these steps.

Screenshot of a step in the Data Vault Wizard

Results

The result is a product that was released to market less than a year after its inception and generated sales straight away. WhereScape Automation with Streaming is a modern solution that unleashes the power of data warehousing automation and reduces the time needed to create a data warehouse by 90%.

Screen recording of a demo of an earlier version of the product, showing how users can get a functional data warehouse built in five minutes.

Positioning WhereScape as a trendsetter

WhereScape Automation with Streaming has put WhereScape firmly on the real-time data streaming map. Since the release of the product, WhereScape has been:

Top 5 Vendors to Watch

Recognised as one of the "Top 5 Vendors to Watch" in the third annual Datanami Readers' and Editors' Choice Awards.

Trend-Setting Products in Data and Information Management for 2019

Chosen as one of the "Trend-Setting Products in Data and Information Management for 2019" by Database Trends and Applications.

Best Cloud Automation Solution

Shortlisted for the "2018-19 Cloud Awards" under "Best Cloud Automation Solution".