The Fastest Cubes in the East
Around 2010, OLAP cubes were a common way to support analytics. They were the fastest way to do interactive analysis over a fixed set of "dimensions".
One of the most widely used solutions was Microsoft's SQL Server Analysis Services, or SSAS. It was well integrated with the rest of the Microsoft ecosystem (SQL Server, SSIS, etc.) and was a popular choice for many companies.
I was working at a company that used OLAP cubes for core marketing optimization tasks. One of our biggest challenges was making sure the cubes finished updating early enough for analysts to adjust bids as soon as they arrived in the morning, while keeping the cube available throughout the update.
As cube sizes grew toward 1TB and beyond (Yahoo's 24TB cube being a frequent nightmare scenario), we pulled the typical levers to improve cube processing time and query performance:
- Heavily leverage incremental updates (see the XMLA sketch after this list)
- Schedule expensive full updates on the weekend
- Use a storage-based snapshot model to ensure the cube was always available during the update process
- Use multiple frontend servers to serve the same processed cube
- Split our backend processing across multiple SSAS servers
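For partition processing, the incremental and full paths largely came down to which XMLA process type was issued: ProcessAdd appends newly arrived rows, while ProcessFull rebuilds a partition from scratch. The helper below is a rough sketch of what such a command looks like; the function name and object IDs are placeholders, not our production code.

```python
def build_process_command(database, cube, measure_group, partition, incremental=True):
    """Build an XMLA Process command for a single measure-group partition.

    ProcessAdd appends only new rows (the incremental path);
    ProcessFull rebuilds the partition from scratch (the weekend runs).
    All object IDs here are illustrative placeholders.
    """
    process_type = "ProcessAdd" if incremental else "ProcessFull"
    return f"""
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>{database}</DatabaseID>
    <CubeID>{cube}</CubeID>
    <MeasureGroupID>{measure_group}</MeasureGroupID>
    <PartitionID>{partition}</PartitionID>
  </Object>
  <Type>{process_type}</Type>
</Process>
""".strip()
```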
Where else to optimize?
The other lever was to start our processing operations earlier. This was difficult because updating a cube sat downstream of a tangled processing graph responsible for building all the cube dimensions, using those to build the fact tables, and then finally kicking off cube processing. Manually resolving the optimal processing order was error-prone, and we suspected we were leaving processing time on the table.
The Solution
The solution was to remove any form of manual dependency analysis from the equation. We created a system that did the following:
- Parsed each cube's XMLA definition on commit to extract all table dependencies (sketched after this list)
- Resolved each table dependency to a set of source tables (and partitions!)
- Created a directed acyclic graph (DAG) of all tables and their dependencies and every cube processing task (dimension update, partition update)
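A rough sketch of the extraction and graph-building steps, assuming table references appear under an element like DbTableName once namespaces are stripped (the real SSAS schema is more involved, and the production parser also resolved named queries and partition bindings back to source tables and partitions):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def extract_cube_dependencies(xmla_path):
    """Map each cube processing task (dimension or partition update) to the
    nodes it depends on. Element names such as DbTableName are illustrative
    stand-ins for the namespaced SSAS schema."""
    root = ET.parse(xmla_path).getroot()
    # Drop XML namespaces so we can match on local tag names only.
    for elem in root.iter():
        elem.tag = elem.tag.split('}')[-1]

    deps = defaultdict(set)  # task -> set of prerequisite nodes
    for dim in root.iter('Dimension'):
        task = 'dim:' + dim.findtext('Name', default='?')
        for table in dim.iter('DbTableName'):
            deps[task].add('table:' + table.text)
    for part in root.iter('Partition'):
        task = 'partition:' + part.findtext('Name', default='?')
        for table in part.iter('DbTableName'):
            deps[task].add('table:' + table.text)

    # SSAS requires dimensions to be processed before measure-group
    # partitions, so every partition update also depends on every dimension.
    dim_tasks = [t for t in deps if t.startswith('dim:')]
    for task in list(deps):
        if task.startswith('partition:'):
            deps[task].update(dim_tasks)
    return deps
```

In the full graph, each table node is also the output of an upstream SQL build job, which is what ties the cube processing tasks and the table-building tasks into a single DAG.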
We were bottlenecked by needing to process all the cube dimensions up front, so that became the first optimization criterion. After that, we wanted to unblock each partition update as soon as its underlying SQL data was ready, and we ran those jobs in parallel.
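A minimal sketch of that scheduling loop, assuming every node in the graph (SQL table builds, dimension updates, partition updates) can be driven through a single run_task callable (a hypothetical hook, not our actual job runner):

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def run_dag(tasks, deps, run_task, max_workers=8):
    """Run every node in the DAG as soon as all of its prerequisites finish.

    tasks    -- every node in the graph, including upstream SQL table builds
    deps     -- node -> set of prerequisite nodes
    run_task -- callable(node) that performs the actual work
    A generic dependency-driven scheduler sketch, not the original service.
    """
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    in_flight = {}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        def submit_ready():
            # Launch every node whose prerequisites have all completed.
            for t, blockers in list(remaining.items()):
                if not blockers:
                    in_flight[pool.submit(run_task, t)] = t
                    del remaining[t]

        submit_ready()
        while in_flight:
            finished, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in finished:
                done_task = in_flight.pop(future)
                future.result()  # surface failures immediately
                for blockers in remaining.values():
                    blockers.discard(done_task)
            submit_ready()
```

Each completed node unblocks its dependents immediately, so a partition update starts the moment its source table (and the dimensions) finish, rather than waiting on a hand-ordered batch.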
This system expanded to cover all of the dozen cubes we were running.
Implementation
Our parsing and orchestration system was built in Python and ran as a service. We used a mix of shell calls to PowerShell and Python ADO.NET bindings (via pythonnet) to communicate with the cube servers.
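The PowerShell side boiled down to handing an XMLA command (like the one sketched earlier) to Invoke-ASCmd. A minimal version of that call, with the pythonnet/ADO.NET path and our retry logic omitted:

```python
import subprocess
import tempfile

def run_xmla(server, xmla_command):
    """Execute one XMLA command against an SSAS instance by shelling out to
    PowerShell's Invoke-ASCmd (SqlServer module). A sketch of the basic
    shape; the real service had richer error handling and retries."""
    with tempfile.NamedTemporaryFile("w", suffix=".xmla", delete=False) as f:
        f.write(xmla_command)
        script = f.name
    result = subprocess.run(
        ["powershell.exe", "-NoProfile", "-Command",
         f"Invoke-ASCmd -Server '{server}' -InputFile '{script}'"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"XMLA execution failed on {server}: {result.stderr}")
    return result.stdout
```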
Optimization
The system reported the long-pole path for the overall cube update, determined by weighting each node in the DAG by its observed runtime and finding the longest path. This enabled teams to focus their optimization efforts on the chain of jobs that actually gated the finish time.
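This is the classic longest-path computation on a weighted DAG. A minimal sketch, reusing the deps mapping from the earlier sketch plus a runtimes dict of observed durations (both hypothetical names):

```python
from functools import lru_cache

def critical_path(deps, runtimes):
    """Return (total_seconds, path) for the heaviest dependency chain.

    deps     -- node -> set of prerequisite nodes
    runtimes -- node -> observed runtime in seconds
    A standard longest-path-in-a-DAG computation via memoized recursion;
    a sketch, not the original reporting code.
    """
    @lru_cache(maxsize=None)
    def longest_to(node):
        prereqs = deps.get(node, ())
        own = runtimes.get(node, 0.0)
        if not prereqs:
            return own, (node,)
        cost, path = max(longest_to(p) for p in prereqs)
        return cost + own, path + (node,)

    return max(longest_to(node) for node in runtimes)
```

Because the graph is acyclic, the memoized recursion visits each node once, so the report stays cheap even across a dozen cubes.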