SQL dimension mapping gives us the full power of SQL to customise how a dimension is defined, allowing for flexible and tailored reporting across datasets.
In most cases, where possible, it’s best practice to define complex dimension logic using rules and custom fields within a lookup table. This approach ensures that logic is created once and can then be reused across any number of data tables. It keeps things scalable, consistent, and far easier to maintain over time.
However, there are scenarios where this isn’t possible, particularly when a dimension needs to be derived directly from a field within a raw data table. In these cases, we rely on SQL dimensions, where logic is applied directly to the source field.
Common Use Cases
1. Grouping Values with CASE WHEN
One of the most common use cases is grouping values into more meaningful buckets.
For example, when analysing campaign delivery by age group, raw platform data may contain fragmented or inconsistent age ranges. Using a CASE WHEN statement allows us to standardise these into cleaner, more usable groupings.
CASE
WHEN age IN ('55-64', '65+', '55-100') THEN '55+'
ELSE age
END
This is particularly useful when combining data from multiple platforms (e.g. Meta and TikTok) where dimension values don’t naturally align.
2. Extracting Parts of a String
SQL functions such as LEFT(), RIGHT(), and SUBSTRING() are useful for extracting specific parts of a text field.
A common example is working with web analytics data, where you may want to extract parts of a URL for analysis — such as page paths, categories, or product identifiers.
SUBSTRING(page_url, 1, 20)
This allows you to reshape raw text fields into structured dimensions that are easier to analyse.
3. Cleaning and Standardising Labels
Often, raw data contains prefixes, suffixes, or inconsistent formatting. SQL functions can be used to clean and standardise these values.
For example, transforming TikTok age labels:
REPLACE(REGEXP_REPLACE(age, '^AGE_', ''), '_', '-')
This converts values like:
AGE_18_24 → 18-24
AGE_55_100 → 55-100
This kind of transformation is key when preparing data for cross-platform reporting.
4. Calculating Time Differences
Another powerful use case is calculating the time difference between two events using timestamp functions.
For example, measuring the lag between a user’s visit and their conversion:
TIMESTAMPDIFF(HOUR, visit_time, conversion_time)
This enables deeper analysis into user behaviour, such as conversion latency or time-to-purchase.
SQL dimension mapping is a powerful tool when working directly with raw data fields. While lookup tables should always be the preferred approach for reusable and scalable logic, SQL dimensions provide the flexibility needed when transformations must be applied at the source level.
Used effectively, they allow you to:
Standardise inconsistent data
Create meaningful groupings
Extract and reshape raw fields
Unlock deeper analytical insights
