How to Get Into Big Data Analytics

by Eric Tsai

Albert Einstein once said “information is not knowledge” and data without context is just organized information.

In essence, data is just people doing stuff.

The true value of data is far beyond obsessions with key performance metrics.

For most businesses, it’s about extracting insights to create value that has the potential to drive innovation to improve products and services.

In fact, more companies are shifting their focus from traditional business intelligence (BI) to predictive analytics – using historical data to predict future events.

Understand the World of Big Data

To put things in perspective, according to IBM, “Every day, we create 2.5 quintillion bytes of data – so much that 90% of the data in the world today has been created in the last 2 years alone.”

There is so much data coming in at such high velocity, and in so many forms, that this phenomenon we call big data is now a problem for most businesses.

In fact, there are so many challenges in dealing with big data that it’s often hard to process let alone understand.

This is especially true for any business that engages with digital advertising or online marketing.

This is why it’s important to stay focused on business objectives and not just on online marketing tactics. As Nassim Taleb, author of Antifragile, wrote: “We’re more fooled by noise than ever before, and it’s because of a nasty phenomenon called ‘big data.’ With big data, researchers have brought cherry-picking to an industrial level. Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information. In other words: Big data may mean more information, but it also means more false information.”

It’s meaningless if we have the means to analyze the data but the data is wrong to start with.

And of course we also need reliable data which is exactly why Samuel Arbesman, the author of The Half-Life of Facts, encourages us to start thinking about long data.

The point is that whether we’re doing marketing or product development, we need reliable data to help us make better decisions.

How to Get Into Big Data Analytics in Online Marketing

Just like you wouldn’t expect a musician to compose a song without a tune, or a restaurant to open without a menu, you can’t expect to develop a strategy or execute a tactic using data without knowing what you want to achieve.

This is at the core of any data-driven performance marketing – making decisions based on analysis to prove or disprove a hypothesis.

It’s about running tests, collecting data, and analyzing results to find the story the data seeks to tell.

If we’re going to become better at performance marketing, we also need better tools and processes to transform big data into smart data.

Here are 7 ways you can get into big data analytics.

1) Focus on Business Objectives

Don’t collect data because you can, collect data because it’s necessary. Identify the core problems that have to do with meeting business objectives.

Speak the right language to the right people as different stakeholders in business have different goals that they focus on.

If you’re focusing on impressions, clicks, CTRs, and CR, and the person you’re dealing with only cares about ROI, CPL, and CPA you’re going to have a hard time communicating your value.

Learn to translate your data into terms that are tailored to your audience.

2) Understand Business Infrastructure

Realize that you will need to understand technical infrastructure such as web hosting, data warehousing, and how data flows in and out of business infrastructure.

In addition, recognize that every business utilizes a variety of applications behind the technical infrastructure.

So make sure that you have some basic knowledge of how each of those applications works and what other tools are available to help you integrate more useful data.

3) Take the Data Science Approach

You need a multitude of skills to stay at the top of your game, but most importantly you need to become a data scientist. This means investing in learning more about statistics, analysis, experimentation, and data visualization.

These skill sets are now in high demand as big data proliferates.

Data science is at the heart of performance marketing, so you need to be the one leading the charge in research and delivery of business intelligence. Ensuring the integrity of your data will pay off tremendously in segmentation and optimization.

4) Integrate the Entire Conversion Journey

In the search engine marketing world, a conversion means either a sale or a lead. KPIs such as CPL (cost-per-lead), CPO (cost-per-order), AOV (average order value), or even ROI are typically what SEMs deliver on the frontend.

However, few SEMs talk about lifetime value or the backend conversion metrics that enable you to get a clear picture of the full conversion funnel.

For example: if your frontend click-to-lead CR (conversion rate) is 10% at a $30 CPA (cost-per-acquisition, where the acquisition is a lead) but your backend lead-to-sale CR is 20%, your click-to-sale CR is really just 2% (10% × 20%), which means your true CPA per sale is $150 ($30 ÷ 20%).
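The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration using the figures quoted above; the function names are made up for this sketch and are not part of any analytics tool.

```python
def click_to_sale_rate(click_to_lead_rate: float, lead_to_sale_rate: float) -> float:
    """Full-funnel conversion rate: frontend rate multiplied by backend rate."""
    return click_to_lead_rate * lead_to_sale_rate


def cost_per_sale(cost_per_lead: float, lead_to_sale_rate: float) -> float:
    """True cost per sale once the backend lead-to-sale rate is factored in."""
    if not 0 < lead_to_sale_rate <= 1:
        raise ValueError("lead_to_sale_rate must be greater than 0 and at most 1")
    return cost_per_lead / lead_to_sale_rate


# Figures from the example in the text
frontend_cr = 0.10    # click-to-lead conversion rate
backend_cr = 0.20     # lead-to-sale conversion rate
frontend_cpa = 30.0   # cost per lead reported on the frontend

print(f"click-to-sale CR: {click_to_sale_rate(frontend_cr, backend_cr):.1%}")
print(f"cost per sale: ${cost_per_sale(frontend_cpa, backend_cr):.2f}")
```

The point of keeping the two rates separate is that the frontend number can look healthy while the backend rate quietly multiplies your real cost.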

All businesses want to know their true return on their marketing dollars; this is why if you don’t have the backend data integration, the frontend data can be very misleading.

And if you have the right data integration, you can proceed to optimize towards the most important KPI, which often times is NOT the frontend metrics.

This applies to offline data as well. Since TV, radio, print, or even billboards can drive traffic to your website, it’s important to take those media costs into consideration. And don’t forget other cost-of-sales items such as call center costs or costs from other channels.

5) Leverage Web Analytics

Web analytics is a great place to start your data journey. It tells you where people came from, where they clicked, how long they stayed, what pages they visited, and a whole lot more.

Web analytics adds context to your site’s visitors through behavioral data that reveals intent. Someone who searched on a branded term will most likely act differently than someone who did not. The same applies to the length of the query.

In fact, even Google uses real human raters in addition to its algorithm to rate content because real human experience is what Google’s search engine tries to mimic.

6) Tell a Story via Data Visualization

Human beings are hardwired to pay attention to and remember stories more than anything else. And we all know that a picture is worth a thousand words.

So what’s better than translating your data into graphs or diagrams to help you narrate your story?

The point of presenting data is not to confuse your audience but to communicate the integrity and meaning of your analytics fully, so they can understand it and act on it.

Storytelling in the context of data visualization depends on how you balance the visual narrative against your target audience’s ability to discover and interpret.

If you want to produce great data visualizations, I highly recommend that you take a look at Edward Segel and Jeffrey Heer’s paper “Narrative Visualization: Telling Stories with Data”, in which they identify seven distinct genres of narrative visualization.

7) Start Predictive Analytics

A great example of predictive analytics being deployed can be seen in Google’s Instant Search. It predicts what you’re trying to search before you finish typing to save you 2-5 seconds per search, guide your search, and load search results instantly as you type.

In fact, predictive analytics is what powers the recommendation engines of companies such as Netflix, Facebook, Amazon, LinkedIn, and more!

These predictive analytics are often utilized as conversion optimizing features inside products, such as ad targeting, recommendations, personalizations, and more.

It may sound far beyond our ability to predict the future, but the truth is that predictive analytics is about identifying and exploiting patterns.

The first step is to understand how to leverage techniques in statistics, modeling, and programming.

However, you can start by doing simple projections or forecasting, then gradually move into more sophisticated techniques.

You don’t even need anything fancy, just some basic Excel skills will do to get started.
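As a rough idea of what a "simple projection" can look like, here is a minimal sketch that fits a straight line through past values by least squares and extends it one period ahead, much like adding a linear trendline to a chart in a spreadsheet. The monthly lead counts are hypothetical, and a straight-line trend is only one assumption among many you could make (it ignores seasonality, for instance).

```python
def linear_forecast(history: list[float], periods_ahead: int = 1) -> float:
    """Fit y = intercept + slope * x by least squares and project forward."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    mean_x = (n - 1) / 2                      # x values are 0, 1, ..., n-1
    mean_y = sum(history) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


# Hypothetical monthly lead counts, oldest first
monthly_leads = [120, 135, 150, 165, 180]
print(f"projected leads next month: {linear_forecast(monthly_leads):.0f}")
```

Once a projection like this exists, comparing it against what actually happened is a cheap way to start spotting patterns worth modeling more carefully.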

The Take Away: Big data analytics is here to stay.

One of the most fascinating things I get to do at work is to look at data from SMBs to Fortune 50s.

We try to prioritize how we spend our clients’ investment based on data because it’s what we do – performance marketing.

I can’t stress enough the importance of statistics and its supersets econometrics and data science in solving real life problems.

Great online marketing strategies aren’t just about the tactics on traffic acquisition or conversion rate optimization (CRO); it’s about getting the most out of your marketing dollars.

It requires you to understand the connection between your marketing activities and the broader business objectives.

By integrating rich, relevant business data and powerful analytics, big data allows businesses to quickly assess emerging trends, identify correlations, and take meaningful actions.

Web Analytics Strategy – How to Use Google Analytics to Gain Actionable Insights

by Eric Tsai

Effective Internet marketing strategies are built on insights from web analytics. The goal is to extract insights from web analytics to improve your campaigns continuously. A simple way of looking at it is to understand how media (or traffic) flows in and out of your website.

In fact, media can generally be categorized as paid, owned, or earned.

Understand Paid, Earned, Owned Media

The idea is simple: paid media is anything you pay for to gain reach, traffic, viewership, or awareness via search, display, television, radio, print, or direct mail.

Earned media is basically the PR you get when someone mentions your brand in the public arena. This includes word-of-mouth that can be stimulated through viral and social media marketing, and conversations in social networks, blogs, and other communities. However, it still requires an investment to generate the PR.

And finally, owned media is just media owned by the brand. This includes a company’s websites, blogs, mobile apps, or its social presence on Facebook, LinkedIn, or Twitter. Offline owned media may include brochures or retail stores.

The bottom line is that paid, earned, and owned media dictate how marketing budgets are allocated, and web analytics can help you gain insights to make better-informed decisions on budgeting, reporting, and investing across all media.


Moving forward, these types of media will converge more and more, and it’s important to have intimate knowledge of how they interact with each other. If you’re interested in learning more, I encourage you to take a look at the latest report by Altimeter Group below called “The Converged Media Imperative: How Brands Must Combine Paid, Owned & Earned Media”.

Web Analytics is Business Analytics

Web analytics is NOT just for the reporting team or the “experts”; it should belong to everyone. This enables every department to slice and dice data about its part of the business and, more importantly, act on it!

When it comes to web analytics tools, there are many choices such as Woopra, Clicky, Tableau, Omniture SiteCatalyst, and Coremetrics Analytics.

Since I’m not trying to compare the different web analytics tools, I’m going to focus on Google Analytics because it’s simple to learn and easy to use, which I personally believe should be the goal of all analytics. In addition, there is already a ton of resources out there about how to utilize Google Analytics, so if you ever run into trouble, just Google it.

Another nice feature of Google Analytics is that it integrates nicely with other Google applications such as Google AdWords for paid search (PPC), or if you’re doing search engine optimization (SEO), it also provides Google Webmaster Tools access.

However, the true power of Google Analytics is the ability to quickly identify your traffic behavior and media effectiveness, and to conduct deep-dive analysis for actionable insights.

Inside your web analytics you will find data such as the keywords that drive traffic to your website, the referral sources that send you traffic, and how your PPC campaigns are doing from a lead and sales perspective.

If you know how to interpret the data, you will be able to understand how your paid, earned, and owned media interact with each other. This allows you to focus on doing things that work and stay efficient with your time and resources.

Simply put, in today’s online marketing world, web analytics is business analytics.

I’m going to go through some simple ways to get you started. For those of you who are already familiar with the basic stuff, I encourage you to go through Google’s own Google Analytics training course, which is the study material for the GAIQ (Google Analytics Individual Qualification) certification.

Understand Traffic Behavior

Everyone knows the importance of ranking for certain keywords, but do you know why you should or shouldn’t rank for certain keywords?

How can you tell if you’re getting the right traffic or not when someone links to you?

Do you know why your PPC brand campaigns raked in 50% more sales when you didn’t make any significant changes to the campaign?

Google Analytics can help you isolate and identify what’s going on with your media.

In Google Analytics, there is a section called Traffic Source; this is where you’ll find which channels are sending you traffic. The goal is to have a well-balanced traffic acquisition strategy.

Working heavily in the search engine marketing arena, I often see large investments in paid search, and then followed by organic search, then display, email, and content.

The reason is simple: paid search provides the fastest return on investment. It’s fast to set up, easy to test, and you’ll get results immediately.

Below is an example view under Traffic Source > Overview.

Google Analytics Traffic Source

Typically you want to start by looking at a large time frame, from 30, 60, or 90 days to 6-12 months. This allows you to take seasonality and shifts in budget (media strategies) into consideration.

The goal is to get familiar with each traffic source the website gets and their behavior. As you can see in this particular example, this website gets 72% of its traffic from search!

The positive is that it has 20% from direct traffic (people typing in the website URL or coming back via bookmarks). I know this client does a lot of radio and TV ads (offline paid media), so it’s good to get some solid data showing that those efforts are paying off in the form of direct traffic. In addition, as brand recognition and awareness increase offline, there will often be a halo effect that helps fuel brand searches online as well.

However, the downside is that this business is essentially “renting traffic” because if you parse out the difference between organic and paid, you’ll find that paid is about 46% of total traffic and organic is about 25% of total traffic. (Go to Search > Overview, then click on Advanced Segments and select Paid Search Traffic and Non-paid Search Traffic.)

Google Analytics Search Traffic Overview

Why may this be a potential downside?

Basically, if you stop doing paid search, you’ll stop getting sales because traffic volume = sales volume. Keep in mind that you should always focus on “relevant traffic”, not just any traffic, because not all traffic is created equal.

In the case of paid search, you’re buying (or bidding) on keywords that are proven to convert.

Another way to view all your traffic is by selecting Paid Search Traffic, Non-Paid Search Traffic, Direct Traffic, and Referral Traffic in the Advanced Segments section, since these are basically PPC, SEO, Direct, and Referrals (people linking to your website).


Then go to the Audience > Overview section to view the behaviors of each channel.

This is where you’ll find interesting data comparing visits, visitors, pageviews, pages/visit, average visit duration, bounce rates, and percentage of new visits.

Using the data below as an example, you’ll find that not only does paid search bring in more traffic, but the traffic also looks very relevant: visitors who came in via paid search view more pages per visit, stay longer, and have the lowest bounce rate.

Google Analytics Audience Overview

And with the same Advanced Segment selected, you can click on the left navigation area to go to Conversions section to view either goals or ecommerce sales numbers.

Goals are typically used for a set of “desirable actions”, so a goal can be a sale, a lead, a download, a page view, a video view, etc. They’re commonly used for lead generation clients. Ecommerce tracking is usually for financial transactions, typically for retail or anyone selling products or services online.

The example GA account here happens to be an ecommerce business, so we can view sales data under Ecommerce > Overview to see if the engagement data above turned into sales. (To see sales volume in the chart, I recommend switching the metric to Transactions; the default is Conversion Rate.)

Google Analytics Ecommerce Conversion Tracking

It looks like Paid Search’s conversion rate is about 5.33%, which is much better than SEO (Non-Paid Search) and Direct traffic, but how come referral has such a high conversion rate at 25%?

Which website is sending traffic to us that’s converting at a rate of 1 out of 4? Can we put more money behind it?

The answer is in the chart.

You can see that there are 4 spikes in the last 6 months from referral traffic (purple). Those are actually internal email deployments, which is why you see a spike in conversion rate (select Ecommerce Conversion Rate).

Google Analytics Ecommerce Conversion Rate

Interestingly enough, when those internal email campaigns were deployed, there appears to be a spike in direct traffic sales as well.

This is because people who received the email may click on it (which gets tracked as a referral), then come back to the website via a bookmark or by typing in the website address directly (which gets tracked as direct) to complete their transaction. Or they may not click on the email at all and simply go directly to the website, or search on Google for a coupon and get captured by the paid media campaigns.

And since Google Analytics tracks the conversion funnel, you can verify this by isolating the date range and visiting one of my favorite features of Google Analytics, “Multi-Channel Funnels”.

The Conversion Funnel (Google Analytics Multi-Channel Funnels)

The conversion funnel basically speaks to the concept of “converged media”: people don’t just convert the first time they engage with a medium, because media is fragmented just like our attention online.

This is why it’s important to understand your conversion funnel as part of the pursuit of excellence in web analytics.

Inside Google Analytics, under Conversions > Multi-Channel Funnels > Top Conversion Paths, you will get data on how conversions happen from first to last click in a given timeframe.

So for us to verify that this client’s referral traffic has an impact on other channels, we need to isolate the timeframe in which the internal email was deployed versus the same time range in the previous weeks.

Then isolate the conversion types you want to see by typing “referral” in the search box; it will then reveal all conversions that contain referral clicks in the conversion funnel (see below).

Google Analytics Top Conversion Paths

What you’ll see is an increase in conversions across the board for every conversion path that contains “referral”. And since we don’t expect to see 300+% increases in conversions every week, it’s safe to assume that it’s due to the internal email blast, as other channels that came in contact with referral also saw a lift.
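The period-over-period comparison described here boils down to a percentage lift per conversion path. The sketch below shows the calculation; the path names and conversion counts are hypothetical and are not taken from the client report.

```python
def percent_lift(baseline: float, observed: float) -> float:
    """Percentage change from the baseline period to the observed period."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a lift")
    return (observed - baseline) / baseline * 100.0


# Hypothetical conversions per path: a normal week vs. the email week
baseline_week = {"Referral": 20, "Referral > Direct": 10, "Referral > Paid Search": 5}
email_week = {"Referral": 90, "Referral > Direct": 45, "Referral > Paid Search": 22}

for path, before in baseline_week.items():
    after = email_week.get(path, 0)
    print(f"{path}: {percent_lift(before, after):+.0f}%")
```

Comparing the same weekday range in both periods keeps ordinary weekly seasonality from being mistaken for campaign lift.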

You can also track it by tagging the email campaigns correctly; just go to “Secondary Dimension” and select Campaign.

Google Analytics Conversion Paths Secondary Dimensions

Tag & Track Campaigns: Google Analytics Custom Campaign Parameters

If you want to learn how to tag your campaigns, simply use Google’s URL Builder. Follow the instructions below and tag all your campaigns to see them in detail in Google Analytics.

  1. Go to Google Analytics Custom URL Builder.
  2. In the Website URL field, enter the destination link you plan to send users to (typically it’s somewhere on your website).
  3. Fill in the Campaign Source to identify the origin of the visit (Google, Yahoo, Facebook, Twitter, Email vendor’s name like Aweber or MailChimp, etc.).
  4. Fill in the Campaign Medium to identify the channel for link delivery (cpc, organic, email, tweet, etc.).
  5. The Campaign Term and Campaign Content fields are not required; use them only if you want to identify specific keywords and ads associated with your campaign. (e.g. you can give a certain campaign your brand keyword because you want to view the data that way, or give the dimensions of your banner to identify which banner was clicked on).
  6. Fill in the Campaign Name to identify the campaign that the link is associated with so it may have multiple links rolled up under one campaign. (e.g. NewYearPromo1 or FebSale3).
  7. Click the Generate URL button and Google will create the URL based on all of the campaign parameters specified above.
  8. Add the new URL in a spreadsheet so you can keep track of the campaigns and be able to see how the various parameters are named.
  9. Use this custom URL when sharing links for your campaigns.
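If you have many links to tag, the same result can be produced in code. The sketch below appends Google Analytics’ campaign parameters (utm_source, utm_medium, utm_campaign, utm_term, utm_content) to a destination URL using only Python’s standard library. The helper name tag_url and the example.com address are made up for illustration; the URL Builder itself remains the reference for how each field is meant to be used.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_url(url: str, source: str, medium: str, campaign: str,
            term: Optional[str] = None, content: Optional[str] = None) -> str:
    """Return url with campaign parameters appended to its query string."""
    params = [
        ("utm_source", source),      # Campaign Source
        ("utm_medium", medium),      # Campaign Medium
        ("utm_campaign", campaign),  # Campaign Name
    ]
    if term:
        params.append(("utm_term", term))        # optional Campaign Term
    if content:
        params.append(("utm_content", content))  # optional Campaign Content
    parts = urlsplit(url)
    # Keep any query parameters the destination URL already has
    query = parse_qsl(parts.query, keep_blank_values=True) + params
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical email campaign link
print(tag_url("https://www.example.com/sale", "mailchimp", "email", "FebSale3"))
```

Generating links this way also makes step 8 easier, since the same script can write every tagged URL and its parameters into your tracking spreadsheet.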

Google Analytics URL Builder

I highly recommend that you do this for all your paid, owned, and earned media as much as possible.

This means when you provide a link in social media, you should tag it. When you provide a link for your affiliates to use, tag it, or have them tag it in a way that lets you identify them. When you’re deploying emails or entering destination URLs for your paid search campaigns, tag them!

Once you’ve tagged your campaigns with the Google Analytics URL Builder, you can go to Traffic Source > Campaigns in Google Analytics to locate and analyze your campaigns.

Switch between Site Usage, Goal Set, and Ecommerce to view data for each campaign.

You can see a great example below on how each campaign is identified by the source/medium here.

Google Analytics Campaign Source

And again, utilize the Secondary Dimension option to pivot other data (such as ad content, keywords, geographic locations, or visitor behaviors like visit duration, pages/visit, etc.) against each one of your campaigns for even deeper analysis!

Finally, once you’ve tagged your campaigns, you will be able to fish them out of the Multi-Channel Funnels reports by creating your own channel grouping based on your campaign naming conventions. That way you can actually see, for example, which specific paid campaign was clicked on after someone viewed a display banner ad from a specific source.

Google Analytics Assisted Conversions Report

A well-defined channel grouping should contain enough data that you can easily identify how your campaigns are doing holistically. From SEO to PPC, from email to display, you should be able to see how your channels interact with each other and utilize that information to optimize for better performance.

Here is an example channel grouping that I’ve created.

Google Analytics Custom Channel Grouping

After you’ve done all of the above, you’ll get a much better view of your campaigns without doing a ton of data pulling or having concerns about piecing together assumptions without reliable data.

The example below shows a report of conversion paths that contain 2 or more touchpoints.

Google Analytics Conversion Paths with Channel Grouping

So how can the multi-channel funnel data be useful? Better yet, can it be actionable?

You bet it can!

In fact, to better understand how multi-channel funnel report can be actionable, we need to understand attribution modeling.

Attribution Modeling: Last-Click versus Reality

It’s a known fact that the Search Engine Marketing (SEM) standard for attributing a conversion is measured on the last-click basis. This means that all of the credit of a single conversion goes to the last channel that converted.

Take the below conversion path as an example.


Although display initiated the engagement with the prospect and contains 2 out of the 5 total touchpoints, on a typical SEM report, paid search would get 100% of the credit.
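To make the last-click rule concrete, here is a minimal sketch. The five-touchpoint path is hypothetical, chosen only to match the description above (display starts the path and appears twice, paid search closes it); the function names are illustrative.

```python
from collections import Counter


def last_click_credit(path: list[str]) -> dict[str, float]:
    """Give 100% of the conversion credit to the final touchpoint."""
    credit = {channel: 0.0 for channel in path}
    if path:
        credit[path[-1]] = 1.0
    return credit


def touchpoint_share(path: list[str]) -> dict[str, float]:
    """Share of touchpoints each channel contributed to the path."""
    counts = Counter(path)
    return {channel: count / len(path) for channel, count in counts.items()}


# Hypothetical path: display starts it, paid search closes it
path = ["Display", "Organic Search", "Display", "Email", "Paid Search"]

print(last_click_credit(path))  # Paid Search gets all the credit
print(touchpoint_share(path))   # Display accounts for 2 of 5 touchpoints
```

Putting the two outputs side by side shows the gap the rest of this section is about: a channel can make up a large share of the path and still receive zero credit.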

And for many years, the search marketing world has debated many different ways of attributing credit via what’s called “attribution modeling” methodology.

The problem with attribution modeling is that it still doesn’t give you 100% of the picture, even though Google works very hard to provide us with as much transparency as possible.

The truth is, there will never be a 100% accurate way to do attribution modeling because true attribution modeling is basically calculating ROI (return on investment) on marketing analytics, not just web analytics. And Google Analytics focuses mainly on web-analytics-based attribution modeling.

This means that the attribution may become biased towards what’s happening on-site instead of taking a more holistic approach that looks at offline and off-site marketing efforts. It can be very challenging to attribute offline sales to online efforts and vice versa, not to mention that there will often be a disconnect between multiple devices within a true conversion funnel (smartphone, tablet, PC).

Now that I’ve provided some arguments against attribution modeling, let’s look at the positives of trying to give credit where credit is due.

  1. By doing attribution modeling, you will at least start to consolidate all your media starting with everything online and on-site
  2. Attribution modeling will provide you a holistic view of your paid, earned, and owned media
  3. Google Analytics makes it simpler and easier with the new Attribution Modeling Tool

Let’s take an example using the Assisted Conversions report below.

Google Analytics Assisted Conversions

This report reveals how many conversions were assisted by each channel (Assisted Conversions), how many were completed by each channel (Last Interaction Conversions), the value of each (Assisted Conversion Value and Last Interaction Conversion Value), and the ratio between the two conversion counts (Assisted/Last Interaction Conversions).

The ratio of Assisted/Last Interaction Conversions reveals the strength and weakness of each channel’s ability to assist another channel to convert.

Basically, the higher the Assisted/Last Interaction Conversion ratio is, the more that channel shows up in the conversion path of another channel, resulting in higher assisted conversions than last interaction conversions.
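The ratio itself is simple division, as the sketch below shows. The channel numbers are hypothetical, not the figures from the report: a ratio above 1 means the channel assists more often than it closes, and a ratio near 0 means it mostly closes.

```python
def assist_ratio(assisted: int, last_interaction: int) -> float:
    """Assisted conversions divided by last interaction conversions."""
    if last_interaction == 0:
        # A channel that never closes: treat any assists as an unbounded ratio
        return float("inf") if assisted > 0 else 0.0
    return assisted / last_interaction


# Hypothetical (assisted, last interaction) conversion counts per channel
channels = {
    "Display (GDN)": (120, 40),
    "Paid Search": (300, 600),
    "Email": (80, 80),
}

# Rank channels from strongest "assister" to strongest "closer"
ranked = sorted(channels, key=lambda ch: assist_ratio(*channels[ch]), reverse=True)
for channel in ranked:
    print(f"{channel}: {assist_ratio(*channels[channel]):.2f}")
```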

Looking at the above report, you’ll find that the highest assisting channel is Google Display Network (GDN). This can mean different behaviors but mainly it’s a good sign that the display channel (banners or text ads on another website) for this business helps to fuel sales to other channels.

In fact, it doesn’t convert very well within its own channel because it received fewer total last-click conversions than assisted conversions.

Ok great, so now I know which channel helps other channels, but how can this be actionable?

This is the beauty of Google Analytics Attribution Modeling Tool.

If you go to Conversions > Multi-Channel Funnels > Attribution Modeling Tool, you’ll find several attribution models waiting for you.

These are the default attribution models provided in Google Analytics.

Google Analytics Attribution Modeling

Here’s more information on Google Analytics Attribution Modeling.

So now with the same custom Channel Grouping selected, I can select up to 3 different attribution models to compare and get an idea of the shifts in conversions (and conversion values).

Google Analytics Attribution Modeling Tool

As you can see from the attribution model above, GDN (#9) stands to gain the most with both the time decay model and the position based model.

Let’s take the time decay attribution model as an example. The 32% increase shows that GDN played a significant role as an “assisting touchpoint” close to the time of conversion but was not effective as the last-click touchpoint; otherwise this attribution model would not give it such a large increase in conversions.

Looking at the position based model, you can see that GDN is supposed to get a whopping 41% increase in total conversions!

This is because the position based attribution model by default assigns 40% credit to the first interaction, 40% credit to the last interaction, and 20% credit to the interactions in the middle. A simple way of looking at it is that it focuses on both the “introducer” and the “closer”.
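The two models discussed above can be sketched in a few lines. This is an illustration under stated assumptions, not Google’s implementation: the 40/40/20 split comes from the text, while the even split across middle touchpoints, the 50/50 handling of two-touch paths, and the 7-day half-life for time decay are assumptions made for this sketch. The path and day counts are hypothetical.

```python
from collections import defaultdict


def position_based(path: list[str], first: float = 0.4, last: float = 0.4) -> dict[str, float]:
    """40% to the first touch, 40% to the last, the rest split evenly across the middle."""
    credit: dict[str, float] = defaultdict(float)
    n = len(path)
    if n == 1:
        credit[path[0]] = 1.0
    elif n == 2:
        # Assumption: with no middle touches, split evenly between first and last
        credit[path[0]] += 0.5
        credit[path[1]] += 0.5
    elif n > 2:
        middle_each = (1.0 - first - last) / (n - 2)
        credit[path[0]] += first
        credit[path[-1]] += last
        for channel in path[1:-1]:
            credit[channel] += middle_each
    return dict(credit)


def time_decay(touches: list[tuple[str, float]], half_life_days: float = 7.0) -> dict[str, float]:
    """Weight each touch by 0.5 ** (days before conversion / half-life), then normalize."""
    weights = [(channel, 0.5 ** (days / half_life_days)) for channel, days in touches]
    total = sum(weight for _, weight in weights)
    credit: dict[str, float] = defaultdict(float)
    for channel, weight in weights:
        credit[channel] += weight / total
    return dict(credit)


# Hypothetical path: display introduces, paid search closes
path = ["Display", "Organic Search", "Display", "Email", "Paid Search"]
print(position_based(path))

# Same idea with hypothetical "days before conversion" for each touch
print(time_decay([("Display", 14), ("Email", 7), ("Paid Search", 0)]))
```

Running both on the same path makes the difference visible: position based rewards whoever opens and closes the path, while time decay rewards whoever was active closest to the sale.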

And since we know from the Assisted Conversions report that GDN doesn’t convert well as the closer (last-click), this means that GDN is more likely seeing the increase in conversions as the “introducer” (first touchpoint) while other channels were more effective in closing the sale (last touchpoint).

Pretty insightful right?

You can then proceed to download the above data to a spreadsheet, add the marketing cost associated with each specific channel, and re-adjust how you credit each channel. Voila: a different way to look at CPA, CPO, CPL, or whatever ROI metrics you want.
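The spreadsheet step amounts to dividing each channel’s cost by the conversions a given model credits to it. A rough sketch, with hypothetical costs and conversion counts standing in for your exported data:

```python
def cost_per_acquisition(cost: dict[str, float],
                         conversions: dict[str, float]) -> dict[str, float]:
    """Cost divided by credited conversions, skipping channels with no credit."""
    return {
        channel: cost[channel] / conversions[channel]
        for channel in cost
        if conversions.get(channel, 0) > 0
    }


# Hypothetical monthly spend per channel
cost = {"Paid Search": 5000.0, "Display": 2000.0}

# Hypothetical conversions credited under two attribution models
last_click = {"Paid Search": 100.0, "Display": 20.0}
position_based = {"Paid Search": 92.0, "Display": 28.0}

for model_name, credited in (("last click", last_click), ("position based", position_based)):
    cpa = cost_per_acquisition(cost, credited)
    summary = ", ".join(f"{channel} ${value:.2f}" for channel, value in cpa.items())
    print(f"{model_name}: {summary}")
```

The total number of conversions is the same in both models; only the way credit is divided changes, which is exactly why the per-channel CPA shifts.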

You can even drill down to specific campaigns using the Secondary Dimension feature if you tagged your campaigns properly!

Now that we understand how channels affect sales through attribution modeling, you are one step closer to knowing what’s REALLY working and what may not be working as well as you thought (like how brand search campaigns are almost always overrated!).

Last but not least, keep in mind that like all web analytics systems, Google Analytics has limitations.

For example, the look-back window is only 30 days. For businesses with longer sales cycles, especially those with a high average order value (AOV), 30 days just isn’t enough. You also need to realize that Multi-Channel Funnels do not take the campaign cookie into account when reporting direct traffic.

Looking at what the industry is doing, you can see just how much of an impact attribution modeling has on marketing budgets, according to a study done by Google.

The take away on Web Analytics & Attribution

It goes without saying that data integrity is essential for marketing analytics, not just attribution.

You do attribution because you want to get to the bottom of your marketing efforts. It’s a complex process of giving credit to your paid, earned, and owned media. It’s about translating the value of your marketing programs.

We’re talking about segmentation, media buying, content management, optimization, and a whole lot more!

And don’t forget that whatever metrics you’re tracking and measuring, they must align with business objectives and be agreed upon across departments (or at least as many as possible).

Web analytics is part of marketing analytics. It requires new processes and technology, but most importantly it requires change – you, your team, your management, or your organization must understand and support the adoption of analytics for it to be effective and actionable.

And ultimately attribution modeling should be part of your marketing efforts to break the department (channel) silos and move towards integration.

Truly integrated marketing campaigns will have great marketing analytics with sophisticated attribution modeling.

I hope you find the above information useful, feel free to share your thoughts on Google Analytics below!

Bonus Google Analytics Resources

Now that you’re totally in love with Google Analytics, here are a few more resources to help you become a GA ninja!