Building Blocks: An In-Depth Look at Segmentation
Segments in Adobe Analytics can be very powerful, but with that power come complexities and nuances that can drastically change what data is returned. Even seasoned analysts can be confused by how some of the logic works. This session digs into scope (at both segment and container levels), explains how and why exclude logic differs from “does not equal/contains”, and debunks some common misconceptions so that you can take your segments to the next level.
Key takeaways
- Segment scope considerations
- The power of exclude logic
- Choosing the right configuration to succeed
Transcript
Hi, and welcome to Building Blocks, an in-depth look at segmentation. My name is Jennifer Dungen, and I work for Torstar in Toronto, Ontario. I’ve been with the company for about 17 years, starting as a web developer, moving into QA, and then finally taking over the web analytics. I’ve worked with Adobe Analytics at every stage of my career, all the way back to the old Omniture days. I’ve also been an analytics champion and user group leader for the last four years, and an Experience League community advisor for two. If you like my content, I encourage you to join my user group chapter, Gnome East, but I also always encourage people to join other chapters as well, since everyone has such great content. Also, you can find me on the Experience League in the Adobe Analytics community, which is a great resource if you need help. In addition to being passionate about analytics, I enjoy crocheting, reading, playing video games, and DIY projects around the house. I’m a sci-fi and fantasy nerd, and I love digging into the how and why things work the way they do. Before we begin, let’s review the agenda of what we’re going to cover today. First, we’re going to do a quick overview of the Segment Builder. Then we’re going to dive into Segment Scope, both simple and complex. After that, we’re going to compare Does Not versus Exclude Logic. And then finally, we’re going to finish off by looking at some examples. Let’s get started. While most of you should be aware of the Segment Builder, I’m just going to do a quick review of it so that everyone can follow along. The Segment Scope can be selected in the top left of the definition area. The options can be used to add containers or change the scope to an exclude. Containers can be stacked inside of the segment as well as within one another. They’ll be visually distinct by being contained within their own borders. Each container also has its own scope, and you can mix the scope between your segment and your container. 
And even more important, you can have a larger scoped container stacked inside of the segment. For instance, you can have a visit or visitor scoped container inside of a hit segment. Each container also has its own set of options that provide you a lot of flexibility to build the segments you need. There are a lot of misconceptions about segment scopes. Many people believe that if they want to get back the number of unique visitors, then they need to use a visitor scoped segment. But the metrics you use are separate from the segment definition. Segment scope determines what information is returned. Let’s look at some examples. Hit-based segments bring back any individual hits that match the criteria. So, in this example, I’m looking for hits that contain home. You can see the four hits in my sample that match this criteria, and these four hits will be returned by the segment. When I use this segment in my freeform table and break it down by page type, I can see that I’m properly returning just my homepage hits, but that I can also absolutely use other metrics with this segment. Now, let’s change the segment to visit. The same four hits match the criteria, but because the segment is visit scoped, all the pages within the visits that contain the match will be returned. If you’re only pairing the segment with the visit metric, the total count will be correct. But if you’re breaking your table down to other dimensions, then you will be getting pages that you don’t expect or want. You should also notice that the total page views is inflated by all those other pages that have been returned. Visitor has similar logic, but it will return even more data since visitor scope segments return all traffic for a visitor that has met the criteria in any visit. You can see the same four matching hits, but all hits from all visits will be returned by this scope. If we look at all three of these segments side by side, you can see even more inflation of the other metrics. 
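The scope behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Hit` record, the `segment` and `metrics` helpers, and the sample data are all invented for illustration and are not part of Adobe Analytics. It shows the same single rule ("page type is home") returning different sets of hits depending on the scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    visitor: str    # hypothetical visitor ID
    visit: int      # hypothetical visit number for that visitor
    page_type: str  # hypothetical page type dimension


def segment(hits, predicate, scope):
    """Toy model: return the hits a one-rule segment keeps at a given scope."""
    matching = [h for h in hits if predicate(h)]
    if scope == "hit":
        # Only the individual hits that match are returned.
        return matching
    if scope == "visit":
        # Every hit of any visit containing a matching hit is returned.
        keys = {(h.visitor, h.visit) for h in matching}
        return [h for h in hits if (h.visitor, h.visit) in keys]
    if scope == "visitor":
        # Every hit of any visitor with a matching hit in any visit is returned.
        keys = {h.visitor for h in matching}
        return [h for h in hits if h.visitor in keys]
    raise ValueError(f"unknown scope: {scope}")


def metrics(hits):
    """Metrics are computed on whatever the segment returned."""
    return {
        "page_views": len(hits),
        "visits": len({(h.visitor, h.visit) for h in hits}),
        "unique_visitors": len({h.visitor for h in hits}),
    }


# Invented sample data: visitor A has two visits, B and C have one each.
HITS = [
    Hit("A", 1, "home"), Hit("A", 1, "section"), Hit("A", 1, "article"),
    Hit("A", 2, "section"), Hit("A", 2, "article"),
    Hit("B", 1, "home"), Hit("B", 1, "product"),
    Hit("C", 1, "section"),
]


def is_home(hit):
    return hit.page_type == "home"


for scope in ("hit", "visit", "visitor"):
    print(scope, metrics(segment(HITS, is_home, scope)))
```

In this sample the unique visitor count is the same at every scope, the visit count only grows at visitor scope, and page views grow at each step, which mirrors the side-by-side comparison in the talk.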
And even at the individual page type level, the additional traffic from across multiple visits can easily be seen. Very often, the lowest scope segment is going to be what you really need. Not always, of course. There are definitely times where you’re going to need those larger scopes. But as you’ve seen, the hit-based scope pairs with all the metrics without bringing back those additional pages. Let’s take a closer look at the comparison of these segments. From a unique visitor perspective, the total values match. So technically, if you only need the UV total, all of these options would work. However, when looking at the visits metric, the hit and visit segments match, but the visitor segment will include all of those additional visits, even though at the homepage breakdown, the visits, being constrained by that page type dimension, all match. Finally, the most profound difference is with the page view metric, despite the page level looking fine. Personally, I like to make my segments as reusable and flexible as possible, even in an ad hoc scenario, because generally people come back asking for more. I’m sure I’m not the only one who has been asked for a simple report. Let’s say UVs that hit the homepage. It’s simple. And based on what we know from our comparisons, we’ll get the same number of UVs back no matter what segment scope is used. But if that same person comes back and says, hey, this report is great, but can we also add a page view per unique visitor here? Well, if you used a hit-based segment, everything is ready to go as is. If you used any other scope, you’ll have to redo your segments or create additional ones that work. Of course, there will always be times when you still need to update your segments or use multiple segments to facilitate more complex calculations. But starting with simpler segments can help alleviate a lot of those headaches. So what is a good use of a visitor scope segment? 
This will, of course, depend on your needs and what you’re trying to accomplish. Maybe you need to look at visitors who have made a purchase within your reporting period and how much traffic they’re responsible for. Or maybe you need to see users who have viewed your promotions. Now in isolation, this may not be too helpful, but you can use this in conjunction with other data or in more complex segments to really dive into what’s happening. Let’s take a closer look at a more complex segment, one that uses multiple clauses. Starting again with hit, I’m now going to look for two dimensions to coexist on that same hit. You can see that only one hit in my sample matches and therefore only a single hit will be returned. Similar to the single clause example from before, in this arrangement, we can confirm that only the homepage hits are returned. And if we compare this to the single clause hit, we can see that there is a significant difference now that we’re looking for a second piece of information. Now if we change this segment to a visit scope, many of you will be looking at the sample and wondering why there are so many matches. And no, this isn’t a mistake. While we are looking for a combination of homepage and logged in, this segment is at a visit scope, so that combination does not have to be on the same hit. So long as a visit has at least one hit on the homepage and one hit that says the user is logged in, the entire visit will be returned. This includes visit one, where we have a logged in homepage, which is likely the intended match, but it also includes visit three, where there is a logged out homepage but a logged in section page. When we look at a simple freeform table, knowing what we know about visit scope segments, we are expecting those extra pages. And on the surface, the numbers look reasonable. But if our intention was to get logged in homepages, then if we dig a little deeper, we can see that we actually missed the mark a little bit. 
Now, this is sample data, so the difference may be more significant when dealing with real results. If our actual intention was to look for visits with a logged in homepage, we have to use a hit scoped container inside of the visit level segment. This will explicitly apply the logic to look for a hit with the combination of homepage and logged in, and it will return all the traffic from those visits. So now, only visit one will be returned and not visit three. Looking at our freeform table now, the numbers have changed slightly from the last example, but now we can see that the total visits matches the logged in visits, which is exactly what we want to see. Now, some of you eagle-eyed viewers might be looking at that not logged in row and saying, well, you still have some not logged in homepages. And you are correct. This just means that within the same visit, we had both hits that were logged out and others that were logged in. And we have nothing in our segment at this time to exclude those additional hits. By now, you should be starting to see a pattern as we go into the visitor scope. Again, in this scenario, I’m looking at visitors who hit a homepage and were logged in, not necessarily on the same hit and not even within the same visit. You can see matches crossing all of our scenarios here. And if this is what you want, great. Now, given this is a continuation of the visit logic, I’m going to skip the freeform example and move on to showing some more specific segment logic. Once again, I’m going to use a hit scoped container within my visitor segment to ensure that I return visitors who had a logged in homepage view. And while yes, this segment will still return all the visits, it will not return visitors that only had behaviors matching visits two and three. Now that you have an understanding of how scopes work, let’s do one final example using some complex logic and mixed scope, where we use a visit container within a hit segment. 
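To make the multi-clause visit examples above concrete, here is a minimal Python sketch. It is a simplified model, not how Adobe Analytics evaluates segments internally: the `Hit` record, both helper functions, and the three sample visits are hypothetical (visit two in particular is invented, since the talk does not describe it in detail). It contrasts two rules placed directly in a visit scoped segment with the same two rules wrapped in a hit scoped container.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    visit: int
    page_type: str
    logged_in: bool


# Invented sample data loosely following the talk's scenarios.
HITS = [
    Hit(1, "home", True), Hit(1, "section", True),      # logged in homepage
    Hit(2, "home", False), Hit(2, "article", False),    # never logged in
    Hit(3, "home", False), Hit(3, "section", True),     # logged out homepage, logged in section
]


def group_by_visit(hits):
    visits = defaultdict(list)
    for h in hits:
        visits[h.visit].append(h)
    return visits


def visits_rules_on_any_hit(hits):
    """Visit scope, two top-level rules: each rule may be met by a different hit."""
    return sorted(
        visit
        for visit, visit_hits in group_by_visit(hits).items()
        if any(h.page_type == "home" for h in visit_hits)
        and any(h.logged_in for h in visit_hits)
    )


def visits_rules_on_same_hit(hits):
    """Visit scope with a hit container: both rules must be met by one hit."""
    return sorted(
        visit
        for visit, visit_hits in group_by_visit(hits).items()
        if any(h.page_type == "home" and h.logged_in for h in visit_hits)
    )


print("rules on any hit:  ", visits_rules_on_any_hit(HITS))
print("rules on same hit: ", visits_rules_on_same_hit(HITS))
```

With this sample, the first version qualifies visits one and three while the hit container version qualifies only visit one. In both cases every hit of a qualifying visit would still be returned, because the outer segment is visit scoped.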
In this example, I only want to return my orders, but only for visits where the user interacted with my product page. I’ve included non-segmented data for comparison here. And as you can see from the results, the segmented data only returns values for the checkout pages where the order actually exists. And it also shows a reduced number of orders based on the additional criteria of needing the product page to be a part of the same visit. Now, let’s take a closer look at the differences between does not contain or equals versus exclude. Many people think these are the same, but they’re actually very different. Let’s look at a simple hit segment if I use page does not contain home and the hit I’m evaluating is home page. This does contain home and therefore it doesn’t pass. And the result is not returned. Now let’s look at an exclude segment. Here my definition looks for page contains home. Using the same input value of home page, the definition does pass. Now the exclusion logic is applied and the result is that the hit is not returned. In the case of hit, these segments have different logic but end up with the same result. And from these examples, you might be thinking I can use either of these interchangeably. And from a hit perspective, you’re right. But let’s look at some other scopes. Using the same criteria of page does not contain home, let’s look at some hits from the same visit. Like before, home page does contain home and you would think that the logic should work. But both section page and product page do not contain home. And if any hit matches the criteria, the segment will return true and the visit and all of its hits will be returned. When we build an exclude container for this criteria, like before, the home page does contain home and while the segment will evaluate the other hits, it doesn’t actually have to. The logic returns true based on that single matching hit, the exclusion is applied and the visit is not returned. 
This scenario has different logic and very different results, one that will make a huge difference to your reporting. By now, you probably know where this is going but let’s just finish this off with an example of visitor scope. Using our same page does not contain home, let’s look at a series of visits from a visitor. Just like on the visit scope, home page does contain home, which shouldn’t meet this criteria. But the other pages do not. And even if one hit matches the criteria, the segment will return true, which ultimately means this visitor and all their visits and hits will be returned. When we build an exclusion segment for this criteria, like in the visit sample, home page does contain home and because this is visitor scope, any hit with home in any visit will qualify, the logic will return true, the exclusion will be applied and in this case, all hits and visits for that visitor will be excluded, nothing will be returned. Just like visit scope, different logic and a very different result. This is not to say that does not logic can’t be used, you just have to think about what you’re trying to achieve to make sure that you have crafted your logic appropriately. Coming back to our previous example, let’s say we only want to see visits that had logged in home pages and we want to exclude the visits where the user hit the home page and were not logged in. This was our original segment and this is our updated segment. You’ll notice that I’ve added both a visit level container set to exclude and inside of that, I have a hit level container to explicitly look for home page and logged in on the same hit. The result will get rid of those additional non-logged in home pages but also it will remove the visits that had both scenarios in the same visit. Since we excluded those visits, the total key metrics will also be reduced. 
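The difference between "does not contain" and exclude can also be sketched in Python. Again, this is only an illustrative model under assumptions: the two functions and the page names are invented, and a visit is represented simply as a list of page names. The same idea extends to visitor scope by passing all of a visitor's hits instead of one visit's hits.

```python
def returned_by_does_not_contain(pages, text):
    """Toy model of 'page does not contain <text>' at visit scope.

    The visit is returned if ANY hit satisfies the rule, that is,
    if any page lacks the text.
    """
    return any(text not in page for page in pages)


def returned_by_exclude_contains(pages, text):
    """Toy model of an exclude container around 'page contains <text>'.

    If ANY hit matches the inner rule, the whole visit is excluded.
    """
    return not any(text in page for page in pages)


# Invented visit with three hits, following the talk's example.
VISIT = ["home page", "section page", "product page"]

print(returned_by_does_not_contain(VISIT, "home"))
print(returned_by_exclude_contains(VISIT, "home"))

# With a single hit (the hit scope case) both approaches agree.
for single_hit in (["home page"], ["section page"]):
    assert returned_by_does_not_contain(single_hit, "home") == \
        returned_by_exclude_contains(single_hit, "home")
```

For the three-hit visit, the "does not contain" version returns the visit (the section and product pages satisfy the rule), while the exclude version drops it because the home page matches.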
Another concept that can cause confusion is when dealing with dimensions that have attribution such as campaign tracking codes which have a seven day attribution by default. When trying to get interactions back for something like campaign Y, it can be tempting to create a visitor scoped segment but as we know, this will return everything for that visitor including the other campaigns, not just Y. By looking closely at a properly attributed campaign, we can see that the tracking code is set on each hit either explicitly on the pages where the query string parameter exists or through the attribution of the configured expiry. A simple hit segment is all you need to get back your key metrics associated to campaign Y. There is no need to make this any more complicated. Now, there’s one little thing that we didn’t talk about during the segment building portion of this presentation and that is that dimensions also have settings for attribution inside of the segment builder. Repeating is the default behavior, returning all hits that are set or attributed. Instance will only return hits where the value is explicitly set but will ignore the hits that are attributed. And non-repeating instance will return the hits where the value is set but will ignore consecutive hits. When dealing with dimensions like tracking code that have both set and attributed values, this can be useful to look at different attributions to create calculated rates. A good use case would be to create a report based on campaign Y. I’m interested in more than just key metrics. I’d like to know the number of times my campaign was used, the orders associated to my campaign, and the orders per campaign. I will use a standard hit segment for my campaign using the default repeated attribution as well as a hit segment using instance attribution. This will tell me the entries to my site by campaign. Then finally, I can create a simple calculated metric using the two segments and the appropriate metrics. 
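The campaign Y calculation above can be approximated with a short Python sketch. Everything here is an assumption made for illustration: the `Hit` record, the helper names, and the sample hits are hypothetical; attribution is modeled as "carry the most recently set tracking code forward" with no expiry; and the non-repeating instance option is left out. It is not a description of how Adobe Analytics stores or attributes values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hit:
    set_code: Optional[str]  # tracking code explicitly set on this hit, if any
    orders: int = 0


def attributed_codes(hits):
    """Carry the most recently set code forward (toy model, no expiry)."""
    current = None
    codes = []
    for h in hits:
        if h.set_code is not None:
            current = h.set_code
        codes.append(current)
    return codes


def repeating(hits, code):
    """Hits where the code is set OR attributed (the default behavior)."""
    return [h for h, c in zip(hits, attributed_codes(hits)) if c == code]


def instance(hits, code):
    """Only hits where the code is explicitly set."""
    return [h for h in hits if h.set_code == code]


# Invented, ordered hits for one visitor.
HITS = [
    Hit("Y"),             # entry with campaign Y in the query string
    Hit(None),            # attributed to Y
    Hit(None, orders=1),  # order attributed to Y
    Hit("Z"),             # entry with campaign Z
    Hit(None, orders=1),  # order attributed to Z
    Hit("Y"),             # second entry with campaign Y
    Hit(None, orders=1),  # order attributed to Y
    Hit(None, orders=1),  # order attributed to Y
]

campaign_orders = sum(h.orders for h in repeating(HITS, "Y"))
campaign_entries = len(instance(HITS, "Y"))
orders_per_entry = campaign_orders / campaign_entries

print(campaign_orders, campaign_entries, orders_per_entry)
```

The ratio at the end plays the role of the calculated metric: orders from the repeating hit segment divided by the count of hits from the instance hit segment.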
When starting to build a segment, it’s always best to review the specifics of the ask to plan how to go about building that segment. Let’s look at a specific use case scenario using a combination of what we’ve covered. I need to get the number of orders that were made in the same visit that the user signed up for a newsletter, but I don’t want to include repeat purchasers. From this, I know that I will use a hit scope segment because I need to get back only my orders, and I will need two containers, one at a visit scope to look for orders and newsletter signups within the same visit, and a second one at the visitor scope to exclude visitors who have made more than a single purchase. Here is a segment that I made to fulfill that request. Then, I can use it alongside unsegmented orders to create a comparison between all orders and the ones that fall into my scenario. Let’s take a look at our key takeaways. One, segment and container scope is about the set of content you want returned, not about the metric you want to use. Two, mixing scope allows you to create logic that both controls the data sets being returned but allows you to look at larger contexts for that data. And three, while similar, does not logic is very different from exclude logic and if crafted improperly can result in incorrect data. Unfortunately, I had limited time with all of you today, but I am working with Justin on a series of articles centered around segment building. If you’d like to dive more into the intricacies of this topic, I encourage you to check them out. However, we have time right now to get into some Q&A about what you just saw and I’d love to hear from you. Thank you. Jen, that was such a great presentation. I loved all your visuals and how clear you made those unique use cases. Amazing job. Thank you for sharing. Let’s dive into our Q&A. So if you have questions for Jen, make sure you submit those now. Okay. All right. 
Our first question, Jen says, what are the rules for when containers should and should not be used? Well, there’s not really any specific rules to when you should use a container. Basically use a container when you need to group items like so I showed you know with hits and you want to look for specific hits and multiple different hits using containers to group those or to group visits. You can even use containers simply inside a hit based segment to just group things just visually because you can collapse them, you can rename them. So basically the rules are really just use them when you need them or when you’re just trying to use them to group your objects together for visuals. Yeah, I can’t really think of any specific rules per se to follow as long as you’re following the logic of what you need to get the segment you actually require. Perfect. Awesome. Thank you. All right. What is the best way to QA your segments to know you’re pulling the right ones in more complex segments? Are there any best practices? Right. This one is always hard. Familiarity with your data is obviously going to be key to making sure that you’ve got the correct segments. Often when I’m doing it, I will look at smaller chunks of the logic in isolation. So when I’m trying to build a big segment, I don’t try to do everything all at once and you know, try to look at the data and hope. I’ll look at each individual container is each container bringing back what I expect and then build the logic from there. So I’ll look at individual containers, look at the conjunction of the containers, and then once everything is built, I will try to dig through that data very, very deeply using correlations, using different types of breakdowns to try and ensure that what I’ve got is exactly what I’m expecting. That’s not an exact science, unfortunately. And sometimes you’re going to find mistakes. 
Sometimes those mistakes may not show up until a couple of months later, when you realize, oops, something’s not quite working the way I expected it to. And you have to go back to the drawing board. But generally, if you’re familiar with your data, and you’re testing the parts, and you’re trying to do the breakdowns as best you can, you can usually be relatively sure that your segment is in fact bringing back what you expect it to. Unfortunately, without knowing exactly what type of segments you are trying to build, it’s very hard to give specifics to what type of breakdowns or testing you need to do. But that’s where Experience League comes into play because you can always reach out to all those experts. I’ll be there. Other people will also be able to give lots of advice. And we can help try to sort you out. Did you by chance see our next question? Because that was a perfect segue into our next question. Are your articles in Experience League? They are. Well, I’ve got one currently live. And the next five, because I’ve got a series of six being planned out, should be released fairly soon. As you saw, I’m working with Justin to get those rolled out. But I’m also trying to make sure that I’m covering all the bases in more depth than I can actually do in this short presentation. So keep your eyes on that. But of course, in the meantime, if you have questions, I’m always available to help. Reach out to me. And you may even be basically an inspiration for another article or another example to use in those articles. Yeah. And you’re very active in communities. So if someone reaches out to you, you’ll respond, right? Yeah. Always. Great. Great. And yes, big plug to the articles coming out. They’re going to be fantastic. All right. Let’s move to our next question. Are there any caveats or gotchas with selecting Instance as the attribution model in a segment versus an eVar’s instance metric? Oh, I have not noticed that per se. 
So long as the combination of what you’re using makes sense. I have used regular instances, occurrences, EVAR instances. As you saw, I was showing the repeated versus instance versus non-repeated instance values there. Half the time just for readability, I’ve been known to create a container and just had EVAR equals something and instance of EVAR exists rather than changing that setting, which could be easily missed when I’m reviewing my segments. They all work the same. So long as you’re getting the data that you need, I don’t see any point in always doing one way of doing it. There’s a million and eight ways of doing things. That’s the way Adobe works with its customization. If it works, I say use it. Love it. All right. Here’s the next question. Can we create custom parameters in Segment Builder? Custom parameters in… I’m not sure exactly what you mean by a custom parameter. I’m not sure if maybe if the person who asked that question wants to give a little follow-up to that while we maybe move on to the next question. Yeah, we can see. We’ll watch for that. Okay. All right. Any quick tips or tricks about using dates and segments? This one is a tricky one because dates and segments used to work one way and they now work a different way and it’s a little odd. If you now put a date range into your segment, it actually supersedes the date range of your panel date, which is sometimes not exactly what you expect. Before you might be able to say, hey, find me something within a year that did X, but you’re still looking at your panel range for a month. Unfortunately, that’s going to extend your entire report now to a year. That’s something that I’m actually trying to work with the Adobe product team on because it has now made it impossible to pull certain information that I’ve kind of butted heads on a little bit. 
My rule of thumb is if you do need to use a date range within your segment itself, just be aware that it is going to extend everything within that segment to whatever date range you have selected. But also, sometimes it might be easier to make sure that it’s definitely less than the panel date range. Otherwise you start getting a lot more data than you expect. Either way, it’s a little weird and interesting and we all have to deal with those kind of issues. For the moment, I’ve tried to avoid date ranges in my segments just until I get used to the new way of doing things. Look at you, Jen, out here fighting the good fight for everyone. This is kind of a related question. Maybe this is the same answer I’ll ask though. Can you use rolling dates inside of segments? Yeah, again, I think it’s sort of that same weird oddity. Now with your rolling dates, you can absolutely use them inside. It’s just what you’re going to get back again might be not what you expect. It might be exactly what you expect. So if you’re using a rolling date range that says, I don’t know, one month ago to two weeks ago, something like that, then that’s the date that you’re going to get or the data that you’re going to get back within your panel, within where your segment is applied. Or again, because segments are now calculating the date ranges from outside of them inside like a little bit differently. Again, that’s why the date ranges inside segments are acting different than they used to. You could always try your rolling date range outside of the segment. Again, feel free to experiment. You might be sitting here in a year or two showing all these amazing things you’ve done with rolling date ranges inside your segments. I’m not going to tell you not to experiment with it because, hey, that’s how we learn. That’s how we make things better. And that’s how we figure out what’s working and what’s not. So give it a try. 
Experiment, give it a try and then reach out to Jen and let her know it worked on Experience League, right? All right. Next question. Can we mix time range dimension to the segmentation logic? For example, if I want a segment for events that happen at a specific date range, man, all these date questions. That’s because dates are tricky. I get it. Again, the same sort of thing is you used to be able to say something happened within a date range. You know, say visitor hit a certain page within date range and then bring that visitor back now. It used to work like that. It doesn’t anymore. If you put that date range inside your segment saying this event happened in that date range, that date range is actually going to affect the data that comes back, as in that data is going to be part of that date range. That is one of the things that I was talking about. It’s not how it used to work. And it’s something that actually now makes certain things really, really hard or almost impossible to pull. So again, this is something that I am trying to figure out with the product team to say, you know, how do we do this now? Because it’s not exactly working the way we expect it to. So unfortunately, I can’t say right now that you can do that. But again, play with it. Maybe you’ll figure out a way of getting it to work and we can all benefit from that. All right, great. All right. Enough of the date questions. Let’s move on. OK, so someone asked if you can revisit the exclude versus does not contain with a quick overview. OK, well, I don’t know exactly what the quick overview is in this context. Again, maybe if you have specific questions, it’s something that we can take on to Experience League and chat about after the fact. 
But essentially, when it comes to does not include or does not contain or does not equal versus an exclude, you know, when you’re looking at visit or visitor level areas like at Scopes, it changes what you are actually looking for, because, again, the evaluation is done on every hit. And if there’s any hit within that visitor, any hit within that visit or visitor, essentially any hit that matches or doesn’t match that will then create, you know, in the does not include, it will still bring it back. Whereas an exclude, if any hit matches, the entire thing gets excluded. I think I just messed up that entire description. I probably just confused you even more and I apologize for that. It’s exclude versus does not include or contain or whatever is a very hard and complex topic to wrap your head around. So I understand where the confusion is. Again, maybe looking at this with specific examples would help actually clear this up a little bit more than me trying to spitball here without any visuals to help you out. And to say you have it mapped out so clearly in the presentation. So, tip is go back and rewatch the presentation and your slides because everything’s mapped out so clearly there. All right. I think we have time for probably one more. So when building a sequential segment in Adobe Analytics, is it possible to have one container at the visitor level and another at the visit level? Absolutely. I think one of my examples even showed that. Wherein I might be looking for unique visitors that have made a purchase and visits that signed up for a newsletter and maybe a hit that did something else. You can absolutely combine different containers within your overall segment scope. So remember, your overall segment scope is going to define what is brought back. Is it bringing back hits that match? Is it bringing back the entire visit that matches a criteria or is it bringing back the entire visitor journey that matches one of those criteria? 
But once you have those containers, again, you can open up that scope to look for, you know, again, I want hits of X. But visitors who also did X, Y, Z and visits that had such and such. So you can absolutely mix them, mix and match. That’s where the complexity comes into play. And again, going back to that first question about how do you QA your data? Well, that’s where you check each thing individually. So you check that visit scoped container with the logic in it. Is that bringing back the visits you expect? You check your unique visitor. Is that bringing back what you expect? And then when you combine the two with whatever scope, is that bringing back logical information to what you are expecting? Right. Awesome. Thanks, Jen. Well, that is all the time we have for Q&A. Jen, again, it’s a special thank you to you for mapping that out so clearly for us and your amazing presentation. As always, it was such a pleasure for me to spend time with you today. Great. Thank you.