Lessons From a Reformed Data Hoarder

“Collect everything.”

That is, word-for-word, the direction I was given by a client regarding their desired data collection strategy. As a former director at a big global media agency, my job was to help Fortune 500 clients manage their first-party data. When I pushed for more specific information, my client repeated their original direction with a bit more emphasis, “Collect everything you can and get it into the database.”

Like many other companies, my client had recently invested A LOT of money into building a data warehouse and was eager to show off its functionality and prove a return on investment. Teams of people set to work collecting every bit of data that they could. There was no strategic plan; data sets would be identified and collected simply because the data was there to collect.

The truth is that the majority of this data served no purpose. Quite often the data was redundant several times over. Significant sums were spent to store the data, and more money was spent trying to identify use cases and value within it. Several years and several false starts later, this client now realizes the very expensive mistake of collecting data without first determining its purpose.

Why Is Cross-Channel Data Collection So Important?

The rise of mobile, connected TV and the Internet of Things offers businesses a vast array of marketing opportunities across an exponentially increasing number of devices, technologies and platforms. Consumers are demanding better brand experiences. With data-driven marketing, it’s possible to foster and maintain rich, one-on-one relationships with always-connected customers on a mass scale.

Here are just a couple of benefits of tying together the cross-channel world: offline sales can be linked to online activity, and search engine usage can be linked to broadcast television commercials. From building a single customer view, to designing an online ad, to placing a guest’s preferred shampoo in a booked hotel room, technology exists that utilizes advertiser data to target messages more effectively, attribute conversions more accurately and provide personalized experiences to consumers in real time.

Gone are the days of data sets containing only aggregate counts of digital ad display impressions, website page loads or shopping cart activities. Marketers can now connect rich data sets to an individual customer to drive personalized interactions. Very granular user engagement, demographic, geographic and other types of attribute data and metadata are available on nearly every offline and online engagement. This data can be extremely valuable and marketers are demanding full use of the data available.

The Big Data Trap

Somewhere around 2010, Big Data became a very popular topic of conversation.

Curiosity about all this consumer data turned to excitement as executives around the globe realized the potential it could have for their companies.

To many marketers (perhaps most), implementing a data strategy seemed to be a fairly straightforward, three-step, linear process:

Acquire a lot of storage space for data. → Collect a lot of data. → Analyze the data.

Following this process, companies would invest a lot of money, time and resources into building a technology stack to store the data. This step alone could be quite time-consuming, and it often left project leadership feeling the need to prove the technology as soon as possible. As a result, the knee-jerk reaction of many was to try to populate the new database with as much data as possible.

The problem with this methodology is that it is very easy to become a data hoarder. It creates pressure to measure data storage rather than outcomes generated by the data. Data is collected with the best of intentions, but similar to the television show “Hoarders,” it quickly consumes all available space, severely limiting speed, agility and efficiency.

I openly admit that I got caught up in this aspect of the Big Data craze. I would get excited each time I was able to collect new data and proclaim, “Just find me a place to store it, and we can figure out cool things to do with the data later.” However, I have learned that just because data can be collected on virtually everything does not mean that data should be collected on everything.

The Path to Reform: Data Collection to Customer Recognition

It is never too late to start a purposeful data collection strategy. As a reformed data hoarder, I like to promote a few basic principles to help keep my clients’ data as organized and valuable as possible.

1. Every single data point should have a determined purpose before it is collected.

Beyond organization, purpose has a direct impact on a corporation’s bottom line. Data warehouses require physical space, electricity, climate control, software, security and personnel, among other fixed and variable costs. While it may be nice to proclaim the large amounts of data that can be stored, every bit and byte of data collected has an incremental price, which impacts ROI. Additionally, as data volume increases, so does the time required to query, process, analyze and act on that data. If data has no purpose, why take on the extra costs? It is significantly more efficient to find a needle in a box of other needles than in the proverbial haystack.

An important corollary to this principle is that collecting data with the intention of identifying its use at a later date is NOT a valid purpose. Too many marketers have felt as if they were drowning in data, unable to find a signal through the rising tide of noise. This is avoidable: an immediate, determined purpose keeps your organization’s data from becoming expensive digital clutter.

2. Every purpose should be SMART (Specific, Measurable, Achievable, Results-oriented and Time-bound).

The following questions should be answered for every data set before collection begins (in no particular order).

  • How will the data be utilized?
  • From where will the data be collected?
  • What tools or technology are needed for collection?
  • In what format will the data be stored?
  • Is any transformation of the data required?
  • Where will the data be stored?
  • What tools or technologies are required to store the data?
  • Will this data need to be joined or merged with any other data sets? If so, what are the keys?
  • What are the implications if this data is not collected?
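One way to make this checklist enforceable is to record the answers in a small structure and refuse to start collection while any answer is blank. The sketch below is only an illustration: the class name `CollectionPlan`, its field names and the helper functions are hypothetical, chosen to mirror the questions above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CollectionPlan:
    """Answers to the pre-collection questions for one proposed data set.

    All field names are illustrative; each maps to one question in the list.
    """
    utilization: Optional[str] = None        # How will the data be utilized?
    source: Optional[str] = None             # From where will it be collected?
    collection_tools: Optional[str] = None   # Tools or technology for collection
    storage_format: Optional[str] = None     # Format in which it will be stored
    transformation: Optional[str] = None     # Required transformation ("none" is an answer)
    storage_location: Optional[str] = None   # Where it will be stored
    storage_tools: Optional[str] = None      # Tools or technologies for storage
    join_keys: Optional[str] = None          # Joins with other data sets, and the keys
    impact_if_skipped: Optional[str] = None  # Implications of not collecting it


def unanswered(plan: CollectionPlan) -> List[str]:
    """Return the names of the questions that still have no answer."""
    return [
        f.name
        for f in fields(plan)
        if not (getattr(plan, f.name) or "").strip()
    ]


def ready_to_collect(plan: CollectionPlan) -> bool:
    """A data set is ready only when every question has been answered."""
    return not unanswered(plan)


if __name__ == "__main__":
    draft = CollectionPlan(utilization="Link in-store sales to online ad exposure")
    print(unanswered(draft))        # every question except 'utilization'
    print(ready_to_collect(draft))  # False
```

A plan that lists only a use case, as in the example, is still blocked because the other eight questions are open.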

There is more to data collection than how many gigabytes, terabytes or petabytes are being stored in a database. You should be able to measure ROI from any purposeful data strategy. The expected benefit received from the data should exceed the time and costs involved with collecting, storing, querying and processing the data.
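The comparison of benefit against cost can be written down as simple arithmetic. The following is a minimal sketch, assuming the four cost categories named above can each be estimated as a single figure; the function names and the numbers in the example are invented for illustration.

```python
def net_value(
    expected_benefit: float,
    collection_cost: float,
    storage_cost: float,
    query_cost: float,
    processing_cost: float,
) -> float:
    """Expected benefit minus the combined cost of handling the data."""
    total_cost = collection_cost + storage_cost + query_cost + processing_cost
    return expected_benefit - total_cost


def worth_collecting(
    expected_benefit: float,
    collection_cost: float,
    storage_cost: float,
    query_cost: float,
    processing_cost: float,
) -> bool:
    """True only when the expected benefit exceeds the total cost."""
    return net_value(
        expected_benefit, collection_cost, storage_cost, query_cost, processing_cost
    ) > 0


if __name__ == "__main__":
    # Hypothetical annual estimates for one data set, in dollars.
    print(net_value(50_000.0, 10_000.0, 15_000.0, 5_000.0, 8_000.0))       # 12000.0
    print(worth_collecting(20_000.0, 10_000.0, 15_000.0, 5_000.0, 8_000.0))  # False
```

In practice the hard part is the benefit estimate, not the subtraction, but writing the figures down forces the conversation to happen before collection starts.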

The entire data strategy should be determined to be feasible before activation. Make sure not to fall into the trap of developing and executing a strategy that may not be achievable due to technical barriers or ROI conflicts. Additionally, different strategies require different resources. Data cannot collect, store or analyze itself. Human and technical resources such as database engineers, data scientists and project managers must be in place throughout in order to properly execute any end-to-end data strategy.

It may seem obvious, but all data collected should support the desired end result. Just as important, data collection must not impede the efficient completion of that result. Collecting data without purpose can create unjustified latency and other unwanted side effects. Questions worth asking include:

  • How will the data improve your understanding of the customer?
  • How will it help you see the whole customer journey?
  • How will you connect digital and offline data?
  • How will the data be connected back to the individual customer?
  • How can the data be used to solve business problems like measurement, engagement and personalization?

The following questions should be answered before the execution of any collection strategy:

  • When will the data start being collected?
  • When will the data first be used for its determined purposes?
  • For how long will the data be used?
  • For how long should the data be stored?
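The four timing questions can be captured as dates and checked for consistency. This is a rough sketch, assuming each answer can be expressed as a calendar date; the `CollectionTimeline` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class CollectionTimeline:
    """Illustrative answers to the four timing questions for one data set."""
    collection_start: date  # When will the data start being collected?
    first_use: date         # When will it first be used for its purpose?
    last_use: date          # Until when will it be used?
    delete_after: date      # Until when should it be stored?


def timeline_problems(t: CollectionTimeline) -> List[str]:
    """Return a description of each ordering problem found in the timeline."""
    problems = []
    if t.first_use < t.collection_start:
        problems.append("first use is scheduled before collection starts")
    if t.last_use < t.first_use:
        problems.append("use ends before it begins")
    if t.delete_after < t.last_use:
        problems.append("data would be deleted while still in use")
    return problems


if __name__ == "__main__":
    plan = CollectionTimeline(
        collection_start=date(2017, 1, 1),
        first_use=date(2017, 2, 1),
        last_use=date(2017, 12, 31),
        delete_after=date(2017, 6, 30),
    )
    print(timeline_problems(plan))
    # ['data would be deleted while still in use']
```

A long gap between `collection_start` and `first_use` is not flagged here, though it is worth questioning: data that sits unused for months is a sign of collecting first and finding a purpose later.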

3. Since the goal of marketing is to deliver relevant messaging, data should be quickly and continuously tied back to the individual customer.

Addressability is the foundation for meeting the expectations of today’s “always on” customers. If you can’t deliver the right message to the right consumer at the right time, your marketing investment is wasted. This means that linking customer data across devices and channels with a universal identifier should be the highest priority.

Continuously recognizing customers as they move across devices, providing tailored messaging and building stronger long-term relationships – these are objectives that depend heavily on gathering and activating the right customer data.
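To make the idea of a universal identifier concrete, here is a deliberately simplified sketch. It assumes a lookup table from device identifiers to a single customer identifier already exists; building that table is the hard part and is not shown. The function name and the sample identifiers are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_customer(
    events: List[Tuple[str, str]],
    id_map: Dict[str, str],
) -> Tuple[Dict[str, List[str]], List[Tuple[str, str]]]:
    """Tie device-level events back to individual customers.

    events: (device_id, action) pairs in the order they occurred.
    id_map: device_id -> universal customer identifier (assumed to exist).
    Returns the actions grouped per customer, plus the events whose device
    could not be matched to any customer.
    """
    journeys: Dict[str, List[str]] = defaultdict(list)
    unmatched: List[Tuple[str, str]] = []
    for device_id, action in events:
        customer_id = id_map.get(device_id)
        if customer_id is None:
            unmatched.append((device_id, action))  # not addressable yet
        else:
            journeys[customer_id].append(action)
    return dict(journeys), unmatched


if __name__ == "__main__":
    events = [
        ("phone-1", "viewed ad"),
        ("tablet-9", "searched brand"),
        ("laptop-7", "purchased"),
    ]
    id_map = {"phone-1": "cust-A", "laptop-7": "cust-A"}
    journeys, unmatched = group_by_customer(events, id_map)
    print(journeys)   # {'cust-A': ['viewed ad', 'purchased']}
    print(unmatched)  # [('tablet-9', 'searched brand')]
```

The unmatched list is worth tracking in its own right: it shows how much collected data cannot yet be tied to a customer and therefore cannot drive a personalized message.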

The Road to Success

When it comes to capturing the right customer data, the stakes are high. Research shows that first-party data provides the richest, highest quality insights into consumer behavior. Marketers across the board are becoming more reliant on this valuable data source, with 82% saying that they plan to use more first-party data in their campaigns.

Deciding what data to track can be tricky. But it’s important to remember that your data collection strategy should be driven by your specific marketing goals. If you truly understand what customer actions matter to your business, you’re going to be well-positioned to decide what data you need to collect.

Originally published October 12, 2016

James Kupras

James was a Senior Solutions Consultant at Signal who is passionate about providing technology-based, data-driven solutions to today’s marketing challenges. He has significant experience working with and integrating SEM, social media, digital display, CRM, email and analytics platforms into unified, customer-centric marketing technology stacks.
