What is a Customer Data Platform (CDP)? The Complete Guide

A guide to understanding Customer Data Platforms (CDPs).

Luke Kline

April 22, 2024

19 minutes

What Is a Customer Data Platform (CDP)?

There is a lot of confusion about what it actually means to be a Customer Data Platform (CDP). More technologies are in the space than ever, and new vendors are popping up all the time. As such, the MarTech world has become increasingly complex and difficult to navigate. It’s never been harder to understand the CDP market.

This blog post will cover:

  • What is a Customer Data Platform?
  • Types of CDPs
  • How Do CDPs Work?
  • Why Were CDPs Created?
  • CDP Use Cases
  • CDP Implementation
  • Should You Buy a CDP?
  • How to Choose a CDP?

What is a Customer Data Platform (CDP)?

A Customer Data Platform, or CDP, is a solution or architecture that enables you to collect, store, model, and activate your customer data. The entire purpose of a CDP is to provide a centralized platform where you can create unified customer profiles and build personalized experiences for your customers.

Customer Data Platform Architecture

How Data Flows Through a Customer Data Platform

Customer Data Platforms help you collect first-party data and consolidate that information into a central database. All CDPs offer features for both data teams and marketing teams, and those features serve two key functions:

  1. They help your data teams collect, unify, and move data more efficiently between systems.
  2. They enable your marketers to build self-serve audiences and send them to their other tools without requiring engineering resources.

What’s the Difference Between a CDP and a CRM?

A CRM, or Customer Relationship Management platform, is quite different from a CDP. Whereas Customer Data Platforms are marketing tools specifically designed to collect and manipulate customer data, CRMs act as relationship brokers for your customers. CRMs like HubSpot and Salesforce primarily focus on managing operations like sales opportunities, contact details, support tickets, purchases, and service history. CRMs help you manage individual interactions, and CDPs help you build and analyze audience cohorts for marketing activation. CRMs tell you what your customers are doing, and CDPs help you understand who your customers are.

What’s the Difference Between a CDP and a DMP?

Whereas CDPs specifically focus on first-party data, Data Management Platforms or DMPs manage second-party and third-party data. DMPs specialize in digital advertising use cases because they help you aggregate and segment audiences for targeting using anonymous data. A DMP is essentially an advertising tool that helps you optimize your paid media spend by identifying lookalike audiences using anonymous identifiers. The key difference between CDPs and DMPs is that DMPs don’t actually store any PII, and the data stored in the platform is only housed for a short duration; CDPs tend to house data for a longer period of time (usually 1-3 years).

Types of CDPs

According to the CDP Institute, there are four main categories of Customer Data Platforms: Data CDPs, Analytics CDPs, Campaign CDPs, and Delivery CDPs. However, the problem with this definition is that it doesn’t account for the underlying architectural differences and instead simply groups CDP solutions by use cases. There are many other specialty categories focused on other features like event tracking, identity resolution, and data onboarding (to name a few). If you bucket CDPs by use cases, differentiating between vendors is very difficult.

The factor that truly separates CDPs from one another is the underlying architecture. Every CDP platform will have a bit of bias or nuance towards a certain industry or use case, but generally, there are five main CDP solutions or architectures: traditional CDPs, Composable CDPs, Hybrid CDPs, infrastructure CDPs, and marketing clouds.

Traditional CDPs

A traditional CDP is a packaged solution designed for collecting, storing, modeling, and activating customer data. This type of CDP operates by hosting and managing the data within its own system(s).

Traditional Customer Data Platform Architecture

Composable CDPs

A Composable CDP is an unbundled solution that collects, models, and activates customer data from your existing data infrastructure. This type of CDP stores no data and instead integrates with your existing data assets, allowing you to avoid long implementation times and unlock a much higher degree of flexibility.

Composable Customer Data Platform Architecture

Hybrid CDPs

A Hybrid CDP is a mix of the previous two solutions. All of the features of a CDP are bundled into the platform, but the architecture has some backward compatibility with your data warehouse. However, the technology is still immature, and many vendors rely heavily on data copy processes, which can introduce significant latency and also create duplicate storage costs because you have to pay to store the same data twice (in both your data warehouse and your CDP).

Infrastructure CDPs

Infrastructure CDPs are data management platforms tailor-made for data teams. Whereas the other CDP types place a large emphasis on marketer-friendly features around audience management and activation, infrastructure CDPs support more upstream use cases like event collection and identity resolution. The entire premise of these platforms is to provide data teams with a suite of tooling to manage data pipelines to create a strong foundation for customer data.

Marketing Clouds

Marketing clouds are extensive product suites offered by large software companies like Salesforce, Adobe, and Oracle. These companies bundle various marketing and data management products into larger CDP-specific offerings. The key difference with these platforms is that they’re not designed to integrate with external tools and technologies but rather operate within their own “walled garden” or ecosystem. The entire premise of a marketing cloud is to help you consolidate and manage your customer data so you can then leverage it across other marketing tooling within the marketing cloud’s ecosystem.

How Do CDPs Work?

CDPs provide a managed platform where you can connect to data sources to collect data and then automatically route that data as events or audiences to the downstream operational tools of your business. Every CDP has four basic components: event tracking, identity resolution, audience management, and Data Activation.

Event Tracking

All CDPs provide out-of-the-box software development kits (SDKs) that you can instrument in your codebase to track specific events your customers are taking or unique traits about them. Once you’ve deployed an SDK on your website or mobile app, every time a user takes an action (e.g., add-to-cart), that event is fired and stored in your CDP. However, most traditional CDPs have a strict event spec that limits what data you can collect, and the schema structure also restricts how you can store that data.

Event tracking data flow

How Event Tracking Works
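
To make the idea of a strict event spec concrete, here is a minimal Python sketch. It is not any vendor's SDK: the `track` function, the `EVENT_SPEC` table, and the event and property names are all hypothetical and only illustrate how a spec can reject events that fall outside of it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical event spec: event name -> property names the event must carry.
EVENT_SPEC = {
    "Product Added": {"product_id", "price"},
    "Order Completed": {"order_id", "total"},
}


@dataclass
class Event:
    user_id: str
    name: str
    properties: dict
    # Timestamp is filled in at creation time, in UTC.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def track(user_id: str, name: str, properties: dict) -> Event:
    """Validate an event against the spec and return it for storage."""
    if name not in EVENT_SPEC:
        raise ValueError(f"event {name!r} is not in the spec")
    missing = EVENT_SPEC[name] - set(properties)
    if missing:
        raise ValueError(f"event {name!r} is missing {sorted(missing)}")
    return Event(user_id, name, properties)


# An add-to-cart action that conforms to the (assumed) spec.
event = track("user_123", "Product Added", {"product_id": "sku_1", "price": 19.99})
```

In this sketch, an event like "Playlist Shared" would raise an error because it is not part of the spec, which is the kind of limitation described above.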

Identity Resolution

Identity resolution is a critical feature of any Customer Data Platform because it allows you to unify different customer datasets across different ingestion channels. CDPs provide proprietary identity resolution algorithms that you can use to link data from different channels and create a unique identity graph for each of your customers to show every historical action they’ve taken and link those actions back to an individual customer.

For example, if a user visits your website and then returns later and purchases a product, you can use identity resolution to stitch those two sessions together under one unified profile. The downside to this approach is that you don’t actually own your identity graph because it’s stored in your CDP. Additionally, because CDPs are largely limited to clickstream data, you can’t easily leverage other data sources or custom entities that only live in your data warehouse.

Identity resolution in a customer data platform

How Identity Resolution Works
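
Vendor algorithms are proprietary, so the following is only a simplified sketch of the stitching idea, using a union-find structure in Python. The `IdentityGraph` class and the identifier formats (`cookie:`, `email:`, `user:`) are assumptions made for illustration; real systems typically add rules for merge priority and conflict handling.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifiers such as cookies, emails, and user IDs."""

    def __init__(self):
        self.parent = {}

    def _find(self, identifier):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(identifier, identifier)
        root = identifier
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[identifier] != root:
            next_identifier = self.parent[identifier]
            self.parent[identifier] = root
            identifier = next_identifier
        return root

    def link(self, a, b):
        """Record that two identifiers were observed together."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)

    def profiles(self):
        """Group all known identifiers into resolved profiles."""
        groups = defaultdict(set)
        for identifier in list(self.parent):
            groups[self._find(identifier)].add(identifier)
        return list(groups.values())


graph = IdentityGraph()
graph.link("cookie:abc", "session:1")              # first anonymous visit
graph.link("cookie:abc", "session:2")              # return visit, same browser
graph.link("session:2", "email:ana@example.com")   # purchase with an email
graph.link("email:ana@example.com", "user:42")     # account created at checkout
```

After these four links, both sessions resolve to the same profile as `user:42`, which mirrors the website visit and later purchase described above.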

Audience Management

Without audience management, a CDP is just “Customer Data Infrastructure.” In order to actually make the insights available within the platform useful, CDPs come equipped with a visual user interface and audience builder. This interface allows you to build and define customer segments and personas without writing SQL. However, with traditional CDPs, your audience building is usually limited to behavioral data, and there is no easy way to leverage proprietary data science models that only live in your data warehouse around things like customer lifetime value, purchase propensity, or even personalized product recommendations.

Audience builder in a customer data platform

How Audience Management Works
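
Behind a visual audience builder, a segment is essentially a set of conditions evaluated against customer profiles. The Python sketch below shows one way this could look; the `build_audience` function, the condition format, and the profile fields (including `predicted_ltv`) are hypothetical and not taken from any specific product.

```python
import operator

# Comparison operators a (hypothetical) audience builder might expose.
OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def build_audience(profiles, conditions):
    """Return the profiles that satisfy every (field, op, value) condition."""

    def matches(profile):
        # A profile that lacks a field can never satisfy a condition on it.
        return all(
            field in profile and OPS[op](profile[field], value)
            for field, op, value in conditions
        )

    return [profile for profile in profiles if matches(profile)]


profiles = [
    {"user_id": "u1", "country": "US", "orders_90d": 3, "predicted_ltv": 420.0},
    {"user_id": "u2", "country": "US", "orders_90d": 0},
    {"user_id": "u3", "country": "DE", "orders_90d": 5, "predicted_ltv": 900.0},
]

# "US customers with at least one order in the last 90 days"
active_us = build_audience(
    profiles, [("country", "eq", "US"), ("orders_90d", "gte", 1)]
)

# A condition on a modeled trait only works if that trait exists on the profile.
high_ltv = build_audience(profiles, [("predicted_ltv", "gt", 500)])
```

The second audience hints at the limitation described above: a trait like `predicted_ltv` can only be used for segmentation if it has been made available to the platform in the first place.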

Data Activation

The final component of any CDP is the actual movement of your data. CDPs wouldn’t be useful if the data solely stayed in the platform, so CDPs are designed to integrate with various operational tools. For many marketers, this includes ad platforms, lifecycle marketing tools, or even CRMs (basically any platform where you interact directly with your customers). The value here is that CDPs automatically integrate with various third-party APIs, so your data team doesn’t have to build and maintain brittle pipelines to try and move data. This means all you have to do is define what data points or attributes you want to sync to your destination.

Data activation from a customer data platform

How Data Activation Works
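
As a rough illustration of "defining what data points or attributes you want to sync," the Python sketch below maps profile attributes to destination field names and upserts each record. The `InMemoryDestination` class stands in for a real third-party API client, and every field name here is hypothetical.

```python
def map_record(profile, field_mapping):
    """Rename source attributes to the destination's field names."""
    return {
        dest_field: profile[src_field]
        for src_field, dest_field in field_mapping.items()
        if src_field in profile
    }


class InMemoryDestination:
    """Stand-in for a third-party API client (no network calls)."""

    def __init__(self):
        self.records = {}

    def upsert(self, key, payload):
        # Create the record if it is new, otherwise overwrite it.
        self.records[key] = payload


def sync(audience, field_mapping, destination, key_field="email"):
    """Send each profile to the destination, keyed on a shared identifier."""
    sent = 0
    for profile in audience:
        if key_field not in profile:
            continue  # nothing to match on in the destination
        destination.upsert(profile[key_field], map_record(profile, field_mapping))
        sent += 1
    return sent


audience = [
    {"email": "ana@example.com", "first_name": "Ana", "predicted_ltv": 420.0},
    {"first_name": "Sam"},  # skipped: no email to match on
]
mapping = {"first_name": "FirstName", "predicted_ltv": "LifetimeValue"}

destination = InMemoryDestination()
synced = sync(audience, mapping, destination)
```

In a real integration, `upsert` would wrap the destination's API and would also need to handle batching, rate limits, and retries, which is the pipeline work a CDP takes off your data team.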

Why Were CDPs Created?

Most people don’t realize that many Customer Data Platforms were created by accident. Basically, every major CDP vendor available on the market today evolved into the category. Most of the platforms started as CRMs, infrastructure tools, databases, tag managers, email tools, marketing automation systems, or even Reverse ETL platforms. Eventually, all of these SaaS platforms realized the same thing: building and maintaining a persistent customer record is difficult. Subsequently, every platform developed a very similar suite of features, and the CDP category was born.

Pedram's law of Customer Data Platforms

Before CDPs existed, managing customer data was really difficult. Not only did you have to set up your own internal processes to collect your data, but you also had to ask your data team to build and maintain custom integrations and pipelines to your operational tools to ensure that data was available to your business teams.

CDPs solved a key challenge in that they introduced a single, unified customer database where you could automatically collect, model, and sync data reliably at scale to your operational tools. The platforms saw major adoption because they offered a number of marketer-friendly tools that helped make data self-serve. The gap that had previously existed between your data teams and marketing teams narrowed because data teams didn’t have to spend their time managing brittle pipelines, and marketing teams didn’t have to wait to build and launch personalized campaigns. The platforms provided an interface for data teams to manage pipelines and a self-serve UI where marketers could build and manage audience cohorts for activation.

Customer Data Platform (CDP) Use Cases

While the lofty promise of Customer 360 is one of the main driving forces for all CDP adoption, at a broad level, there are two main reasons to adopt a CDP:

  1. You want to offload engineering work from your data team and adopt a managed platform that can collect and move data between systems efficiently at scale.
  2. You want to give your marketing team access to self-serve audience tooling so they can launch and test marketing campaigns faster and deliver more personalized customer experiences.

Underneath these two pillars, there is a large list of use cases like:

  • Event Tracking: Capturing behavioral actions like page views, purchase events, signups, etc.
  • Identity Resolution: Creating unified customer profiles to better understand how your customers are interacting with your brand via an identity graph.
  • Audience Management: Segmenting and targeting specific users based on various attributes like purchase history or specific user traits like age, gender, location, etc.
  • Personalization: Serving personalized recommendations or dynamic content on your website based on purchase history or viewing habits.
  • Advertising: Uploading a list of customers to Google or Facebook so you can retarget shopping cart abandoners or identify potential lookalike audiences.
  • Lifecycle Marketing: Building personalized customer journeys across multiple marketing channels like SMS, email, push, etc.
  • Data Enrichment: Enriching your operational tools like Salesforce or Zendesk with additional insights so your business teams can be more effective.
  • Analytics: Measuring campaign performance across channels by analyzing customer behavior or comparing and contrasting audience overlaps or specific user traits.

These are just a few examples, but technically, there’s no limit on the number of use cases that a CDP can support. However, given that packaged CDPs are largely limited to behavioral events, many companies are now transitioning to a Composable architecture, which offers greater flexibility, interoperability, and a far lower cost of ownership because there is no duplicative data storage when you integrate with your existing data warehouse.

How Much Do CDPs Cost?

Traditional CDP pricing is often based on monthly tracked users (MTUs) or users who generate events. The overall cost is directly linked to two factors:

  1. Feature Capabilities: the number of features you need within your CDP for your use case.
  2. Data Volume: the number of users you track and store in your CDP.

For some companies, a CDP is simply an event collection tool; for others, it’s an identity resolution platform; and for others, it’s a marketing activation engine. The cost for your CDP will be directly linked to the core features that you need and the specific use case you’re trying to tackle. If you need every component or feature set that a CDP offers, your contract size will definitely be larger. Likewise, the number of users you track will also affect the cost. If you're an enterprise organization with millions of users, you can expect to pay much more than a small-to-mid-sized business with a few hundred thousand users.

For the most basic version of a CDP, you can expect to pay between $50,000 and $150,000 annually. For larger companies with more volume, this quickly becomes hundreds of thousands or millions of dollars per year. This steep cost is one of the main reasons that companies are choosing to adopt a more modular Composable CDP architecture, assembling individual components like event collection or identity resolution around their existing infrastructure rather than buying into an all-in-one platform.

The Best Enterprise Customer Data Platforms

There are a lot of enterprise customer data platforms available on the market today, and trying to understand the differences between them is extremely challenging. With that in mind, here’s a quick summary of the top ten CDPs available on the market today.

  • Hightouch is a Composable CDP that integrates with your data warehouse. The company is headquartered in San Francisco and was founded by former Segment engineers in 2019. The platform is specifically designed to dynamically adapt to the unique data and business-specific use cases of any company across any industry. Unlike traditional CDPs, which have strict limitations around how you can actually use your data, Hightouch is extremely flexible, so you can deliver better experiences, optimize performance marketing, and leverage data to move faster across your organization.
  • Salesforce CDP (or Salesforce Data Cloud) is an offering from Salesforce designed to help you integrate all of your Salesforce data across different orgs like marketing, service, and support into coherent customer profiles that your marketing team can then leverage across other existing Salesforce tools. The product was made generally available (GA) in 2021 and has been renamed/rebranded many times since then. However, unlike other CDPs which are designed to integrate seamlessly with external marketing applications, Salesforce CDP supports a very limited number of external destinations, which can make it challenging to use for many marketing teams.
  • Adobe CDP is a newer product offering launched by Adobe in 2021. The platform is built to help you integrate all of your customer data across the Adobe suite into a unified customer profile that your marketing team can then activate across Adobe-specific channels to power personalized experiences. However, similar to Salesforce, this platform is only built to integrate with existing Adobe products and services, with limited support for external tooling.
  • Segment is one of the most well-known CDPs, primarily known for creating a robust event tracking framework that data teams could use to capture and forward events to downstream marketing tools. The company was founded in 2012 and is headquartered in San Francisco. Since the inception of the company, the platform has expanded to include a number of new capabilities like audience management and identity resolution. However, in 2020, the company was acquired by Twilio for $3.2 billion and has since struggled to regain the performance that once made the company popular.
  • mParticle is a traditional CDP that specializes in mobile app use cases for enterprise companies. The company was founded in 2012 and is headquartered in New York. It was originally created as an alternative to Segment to provide a more flexible platform that marketers could use to power their complex marketing use cases. However, more recently, the company has been investing in warehouse-centric capabilities like Reverse ETL to satisfy the demand for warehouse-native capabilities.
  • Amperity is best known for its identity resolution offerings for B2C retailers. Since being founded in 2016, the company has earned a number of patents for its various algorithms. Up until the last few years, Amperity has largely operated as a platform for data teams, but more recently, the company has begun to dip its toes into the marketing world, adopting new but limited activation capabilities for marketing teams to help them drive value from the unified profiles that live in the Amperity platform.
  • Treasure Data was originally created as an analytics and engineering platform for data teams in 2011, but the company pivoted into the CDP space after realizing that many of the use cases that customers were powering were actually marketing-related. The company has launched many CDP features to help create unified and actionable customer profiles, but the underlying architecture of the platform can be challenging for non-technical users. Under the hood, the platform is powered by older technology (notably Hive and Hadoop), which are not usually the preferred choices of data teams.
  • ActionIQ is a hybrid CDP that offers some backward compatibility to integrate with your existing data infrastructure. The company was founded in 2014 and focuses on helping you unify and manage your customer data so you can power your marketing use cases. It was originally built to compete against Segment and mParticle. However, pent-up demand for more “warehouse-native” and “composable” architectures has led the company to re-architect the platform to try and accommodate the flexibility that modern companies are demanding.
  • RudderStack is an infrastructure CDP that helps data teams collect events and manage ETL pipelines. The company is headquartered in San Francisco, and the platform was founded in 2019 as an open-source alternative to Segment that data teams could use to capture behavioral data. However, while the company certainly offers many CDP-oriented features, RudderStack is entirely focused on helping data teams create a solid data foundation that can then be used to power downstream operational use cases.
  • Simon Data is a Composable CDP that runs on top of Snowflake. The company was founded in 2014 and is headquartered in New York. Currently, the platform is built to run entirely on top of Snowflake via a managed instance by Simon Data or through your own existing Snowflake instance. From a marketing standpoint, the platform offers many features to help you build granular audiences and move data out of Snowflake to your downstream marketing channels.

How Do You Implement a CDP?

The “black-box” nature of traditional CDPs makes them difficult to implement because you can’t actually use or test the technology without undergoing a lengthy sales process to scope out your needs and requirements. The actual implementation process of a traditional CDP can take anywhere from 6 to 12 months, and running a proof-of-concept (POC) is nearly impossible with most CDP vendors because there is quite a lot of engineering work involved in getting these platforms up and running.

Traditional CDP architecture also makes adapting to dynamic use cases very difficult. These platforms are designed with a strict event spec that you have to follow, and they have no way of guaranteeing event delivery to your downstream tools if your events fall outside of their spec. Storing data can be equally challenging: most traditional CDPs come with preconceived notions about how you can collect and store data, and each has a unique schema that doesn’t necessarily conform to your specific use cases.

The only solution that’s flexible enough to integrate with your existing data infrastructure and leverage your existing schema is a Composable CDP because you can take advantage of the existing schema that lives in your data warehouse, and you don’t have to re-conform your data to another platform. With technologies like Reverse ETL, you can basically circumvent the entire implementation process and start activating your data immediately.
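
One common way to think about Reverse ETL is as a diff between what your warehouse query returns now and what was sent during the last sync, so only the differences are pushed downstream. The Python sketch below assumes that approach; the `diff_rows` function and the sample rows are hypothetical, and plain dictionaries stand in for warehouse query results.

```python
def diff_rows(previous, current):
    """Compare two snapshots of query results keyed by primary key."""
    added = {key: row for key, row in current.items() if key not in previous}
    changed = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    removed = [key for key in previous if key not in current]
    return added, changed, removed


# Rows sent during the last sync vs. rows the warehouse returns today.
last_sync = {"u1": {"plan": "free"}, "u2": {"plan": "pro"}}
warehouse_now = {"u1": {"plan": "pro"}, "u3": {"plan": "free"}}

added, changed, removed = diff_rows(last_sync, warehouse_now)
```

Because the rows come straight from your existing tables, the schema you already have in the warehouse is the schema you activate; there is no second data model to conform to.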

Traditional CDP vs. Composable CDP (Comparison Guide)

Download our comparison guide to understand exactly where traditional and Composable CDPs differ.

  • Event Collection
  • Real-Time
  • Identity Resolution
  • Audience Management
  • and more!

How to Choose a Customer Data Platform?

Choosing a CDP should come down to your specific use case, and you should never buy technology just for the sake of technology. One of the fundamental problems with traditional CDPs is that they come with preconceived notions that inform how you collect and store data.

For example, if you’re a video streaming company, you have to follow the event tracking spec and the schema structure provided by that vendor. Most CDPs only support objects like users and accounts, so if you have custom data science models or other entities like playlists, subscriptions, workspaces, etc., you’ll quickly run into trouble. Anything custom that falls out of the norm is not natively supported, and trying to configure your CDP to enable that type of custom use case is almost impossible.

Every company is converging to a point where they know they need a centralized platform to manage and act on their customer data. However, many companies don’t realize that they already have a data warehouse that is already acting as a single source of truth. This is why leading companies like Bol.com, Zebra, and Chime are turning to the Composable CDP. If you’re looking into CDPs you should thoroughly evaluate traditional CDPs vs. Composable CDPs.

If you’re interested in learning more about the Composable CDP, book a demo with one of our solution engineers or check out our Composable CDP Hub.
