MBI - Ask an Expert
Session 1:
- What customers say about Commerce Intelligence
- Business transformation through intelligent data
Session 2:
- SQL optimization using Common Table Expressions
- Best practices in SQL-based reports
- SQL-based reporting demo
- Questions and answers
Recorded webinar content
Ask an Expert is comprised of two sessions:
Session 1
Welcome everybody and thank you for joining us today for the MBI Ask an Expert webinar session one. This is the first of two sessions. Today we’ll cover some topics around MBI and business operations and we’ll continue this on Thursday with some more technical topics. So please join us again later this week at this same link. I’m Webster Love and I’ll be your host today. Joining me are my co-hosts, Sonika Viraneni, and your presenter for today’s session, Deepak Kumar. We’re all from the product team for MBI at Adobe. We’ll also be featuring a very exciting guest speaker, Shane Newton from Pixels, and Deepak will introduce him in a minute. Today we’re playing a recording of Shane’s content and then we’ll have Shannon Dupock doing a live Q&A at the end of our session. Before we dive in, we have a couple of technical notes. For any audio issues, audio is through Adobe Connect, so your speakers or headphones must be unmuted and then ensure that the speaker icon is on at the top of your Connect room. We don’t currently have an option for calling in via phone. Adobe Connect doesn’t always work well with Bluetooth headphones, so if you’re using those and having audio difficulties, you can try switching to wired headphones. You can also try logging out of the room and back in, or switching to the desktop app instead of the browser version. Throughout the webinar, if you face any technical issues with your audio or the screen share, or if you have questions related to the functionality of MBI or to the content that’s being presented, please feel free to put those questions in the Q&A pod at the left side of your screen, as we’ll be monitoring that while recording today’s, well the recording of today’s content, please. We’ll try to answer those questions either right away through the chat or during our Q&A segment at the end of the webinar. We will be sending a link to a recording of the webinar out afterward. Alright, now I’ll hand it over to Deepak.
Hi everyone, thank you for joining this webinar. We have with us our guest speaker, Shan Newton. I’m going to introduce him shortly. This webinar is a great platform to understand five key aspects. What do our customers say about MBI? Business transformation through MBI. How MBI is contributing to our client’s growth. In this session, we have a success story of one of our prestigious clients, Pixels Inc. SQL optimization using common table expressions. Our MBI users playing a data analyst role can get benefited with CTE style of writing SQL queries. Best practices in MBI SQL base reports. A quick demo on creation of SQL base reports using common table expressions. We will wrap up with a Q&A session.
A little bit about Shan’s background. Shan has two master’s degrees to his credit. Four years of experience working with American Airlines, the world’s largest airline. He worked at American Airlines headquarters in Dallas, Texas. He was associated with several departments in and around digital marketing as an information architect and senior developer. He helped the company transition from a legacy monolith codebase to cloud-based microservices. He also spent six months as a full-time A-B test developer using Adobe Target’s advanced features. Namely, JavaScript injection. Shan left American Airlines to join Pixels Incorporated, the world’s largest image editing service for e-commerce. Where he transitioned into data science and business intelligence reporting. Shan has been with Pixels for three years now. He helped the startup become profitable through an organizational transformation in which data science and Magento BI played a key role. He also manages Google Analytics as a professional certification in search engine optimization. And manages other data systems at Pixels. MBI in particular. He has written tens of thousands of lines of SQL statements. Author over a thousand custom reports. He manages dozens of dashboards for various departments. He’s a super user of MBI and has pushed the limits of what MBI can do. And he has even found workarounds to push those limits further. Thank you Shan for being with us. Hi Deepak, thank you. Very nice introduction. Could you tell us more about Pixels? Well, Pixels is, as you said before, the largest image editing service in the world for e-commerce. And it’s for e-commerce only. So we’re not retouching photos of weddings or portraits. This is purely for e-commerce. And if you are a retailer or a producer of products and you want to sell your products online. You need to have great photographs of your products. Probably multiple photographs per product from different angles. And many companies have tens of thousands of images per month of new products. And so this is a really interesting industry. And every image is data. Every image that comes to our company is a row in a database. And then what happens to that image? There are people editing these images and it goes through different steps of editing. So there’s different workers touching these images. Each one of those touch points is a piece of data in our database. So we have a really interesting ecosystem of data at our company. And we have this really great online platform for studios. You can see here, this is an idea of the type of editing that we do. So Photo Studio is taking a photo of a product on a mannequin. They send it to us along with thousands of other images. And we remove the background, we remove the mannequin, and we return back the final product. And there’s a lot of players in this industry. There’s not a lot of players in this industry that have online software that complement this industry. For example, maybe 10 years ago it was very common to send all your images over an email or Dropbox. But we have a software package in our browser. So you log in, you can upload images through that, you can see the before and after of what’s done. You can mark rejections and send them back to us to rework. There’s color matching, there’s all sorts of interesting things going on. We have AI technology that assists our editors. So we even have an API. And so there are customers who have built applications that plug into our API. And that whole system is automated. So for example, this company has an app where they can take a photo of their furniture. It can be any background, it could be in your home. And it automatically sends that to us, we remove the background, send it back to the app. So it can be beautiful.
That’s awesome. You have been working with MBI tools since three years now. What was your experience like working with MBI? Well, when I first joined Pixels, our data team was not focused on Magento BI. And I don’t know the reason for that. But we had a data team of four data scientists. And most of our queries would go directly onto our relational database. Queries would run slow. People weren’t sharing code or queries. It was a different kind of environment. As we learned how powerful Magento BI is, we began using it more and more. So currently, we have 109 users, we have over 500 dashboards, and we have over 9000 automated reports currently in Magento BI. We actually pulled those statistics this week. So we have over three databases actually pouring data into the MBI data warehouse and over four terabytes of data. So I think four terabytes is pretty large. For a data scientist or BI analyst, that’s a great place to be. We like data. So here’s an example of one of the dashboards that I’ve made in MBI. And I like showing this. Of course, I blurred out a few things. I can’t give all our secrets away. This is a niche product, a niche service that we offer. And so what I wanted to show about this dashboard, one is having small statistics at the top. You know, when you build these dashboards, it can be however you want. If you want a giant chart at the top, you can have that. It could be just a plain old data table. It looks like a big spreadsheet. But this is how I like to design them. Something high value and interesting at the top. So at a quick glance, you see exactly where you were last month and where you are this month. This makes dashboards kind of fun. Like you want to check these dashboards regularly so you can see how things are moving. I like to have a row of a few charts, high value KPIs. And so at a quick glance, you’re getting everything above the fold. And then a bit lower, I like to have larger charts, data tables, and everything else. So as we talk about some of my opinions about business intelligence and tools. And so if I were on a leadership team and I’m thinking about a new organization and we’re looking for a solution for business intelligence, maybe we have an old vendor, maybe we don’t, but we’re just looking around for the best tool in the industry. I would not feel attracted to a purely codeless point and click kind of BI tool. I see a lot of this marketing. And I know that a lot of people feel the same way that I do. That codeless is a bit overhyped. And there’s a lot of reasons I say that. But I really believe that code first, data analysis, and business intelligence is the best way for a company to make progress. So in MBI, luckily there’s both kinds of Report Builder. So there is the point and click visual Report Builder. And it is great. It’s fast. It’s useful. It has group by features. But SQL Report Builder is where I spend 99% of my time. Out of those thousand reports that I’ve made, I would say maybe less than five have been built in Visual Report Builder. And I would challenge anyone. I think I could use SQL Report Builder and be faster or just as fast as the Visual Report Builder. Just to drill down on some of those problems encountered by codeless point and click tools. For one, knowing all the filters and the nuances of the data can be really challenging. It’s also possible to make charts and reports that give you answers. But those answers aren’t necessarily correct. And they might not line up with what other reports other people are making. And so that comes to this problem of having a single source of truth. And that’s something that a lot of companies are struggling with. Lastly, it’s easy to forget why you did certain things. It’s a little bit more difficult, in my opinion, than having code. With code, I can copy and paste work from someone else that I trust or from a report that I trust. Whereas with a point and click interface, sometimes it can be difficult to duplicate some of those filters and nuances that should be applied to various reports. And there’s a lot of people writing about this online. But today I want to focus on the SQL Report Builder. OK, so let’s look at a typical MBI setup. So this is how our company is set up with Magento BI. We have a live database. And this is normal. It would be for the application only. You would have a separate database that’s for employee facing tools and for your BI reports, data science queries. But that’s still a traditional database and it’s syncing every night. Maybe you have a different sync pattern, but I think this is a very common thing that we see in enterprise organizations. So live DB for the application only, read only DB for BI and employee tools. And then we have our Magento BI Data Warehouse is syncing data from our read only.
So this does mean that our MBI data is 48 hours behind the live DB. But that’s something that we can live with.
So talking about some of the benefits of MBI for us, I can give a really good example. There is a query. It’s extremely large. It’s about 500 lines. And it’s extremely important for our organization. It really helped us go through an organizational transformation using this one query. It gave us a new type of insight and a new KPI. So we look at this, the results of this query every day.
It’s a really complex query and it takes about 50 minutes to run. And this is after having optimized it as much as possible. It still is taking 40 to 80 percent of our CPU. You know, it spikes up to 80 percent sometimes. And this is unacceptable. Actually, if I run this on our database after about 10 minutes, I’ll probably get a message from our DevOps team. Someone will notice that I’ve blown up our read-only database. Conversely, in Magento BI, this query runs in 57 seconds. So, really complex query, 50 times faster in Magento BI. And because Magento BI caches the results, the second time that I run this query, it’s virtually instant. Less than a second. Thanks for the example you gave on SQL query performance in MBI. This serves as a testimonial as to how MBI’s SaaS infrastructure is utilized to its best potential. With MBI coming into the picture, the data analyst can now avoid the wait time involved in connecting to the databases, save and retrieve SQL queries and of course, the query response time as you mentioned. Yeah, and this really, it goes further than just having a fast query or not having to run this on a slow database.
As an SQL writer, when I’m discovering new data or trying to solve a new problem, having instant feedback is really important. I would never write a 500-line query and then run it to see what kind of errors I’m going to get back. You know, it’s always small chunks. Small little bit of a query, get the results, take a look at those results, figure out that you actually need to filter a few more things. So, having that instant feedback is really important and if you can imagine, 50 times faster. So, if it takes me on a traditional database, let’s say two or three days to solve a complex problem, theoretically, most of that time is spent waiting for results to get back. If I can move 50 times faster, I can get that done in maybe two hours.
There are a lot of clients who might be looking forward to use a full stack SaaS tool like Magento BI. How do you think MBI can help them transform their business to the next level? So, now I’ll talk about organizational transformation and how Magento BI can play a role in that. And at our own company, we went through a large organizational transformation two or three years ago. And I’ll share some of that story as well. This is a non-optimal process where we had a database, the traditional database, and if someone wants data, we actually had a ticketing system. So, we would do 600 tickets a year of various departments requesting data. Maybe it’s just three weeks of data for this one particular question. Another team wants to check on some of their workers and pull performance data on that. But in any case, we would go to the database and then there needs to be a data worker who creates a query, gets that raw data, puts it into a spreadsheet. We like to use Google Sheets, easy to share. At some point, there’s some sort of filtering and cleansing process with the data. Then we’re going to do some pivots, some analysis, and eventually we get to this final insight.
From a stakeholder point of view, there’s all this waiting. So, you share your data requirements. Maybe you send a ticket. You wait because they have a backlog. So, you might be waiting one or two days before they even begin your questions. Then they send you back the raw data. You see that raw data and most of the time, there’s new questions. Maybe the data is not quite what they expected or now they have new questions. They need to add something to that data. So, now they’re waiting more. By the time you get to this final analysis, it’s outdated. I mean, it is a historical snapshot of what happened at that time, but it’s not an automated report that you could check every month. So, this whole process can span days, maybe weeks. If it’s urgent, maybe you can get it done in a day or two. But this hassle of getting data and this slow process, it can deter some people from using data in their decision making. Conversely, some people can be a little bit paralyzed by this and not want to make a decision until they have the data. And so, that also slows decision making productivity.
So, the optimal flow is what we found in Magento BI. Our company transitioned from a ticketing system to only using Magento BI. And any time a new ticket came in, we would figure out what kind of question that is, build it into an automated report, and give it to that department’s dashboards. We started building dashboards for each department and we would just put it on there. So, then if they ever have that question again, they can self-service that data themselves. Another thing that this allowed us to do was really scale up the amount of problems that we could solve without hiring extra data experts. And just to show about the self-service aspect, so this is a glimpse of what it would look like in MBI. Sometimes it’s a simple spreadsheet type of data that a department wants. So, you can see here in the background, sort of spreadsheet-like. And they can just click here and export to CSV, then they can import it into a Google Sheet, and they can work on this data themselves. So, there were so many times we would get a request for data and we’d realize, oh, we actually built that three months ago for someone else. So, you can, you know, here’s the link to that report. You can go pull the data yourself.
Yeah, good to know how MBI helped your organization to automate the data. I’m sure a lot of people in this audience can relate to practical challenges in data warehousing and business intelligence implementations. One, of course, is the replication of data periodically from source systems, analysis of data using database tools to create datasets, consumption of these datasets to making meaningful dashboards. Even after doing the visualization, sometimes there are iterations that are needed. As you rightly pointed out, that could prove costly. At an enterprise level, there is a constant pressure in making the right data available at the right time. Senior management would be very unhappy as their decision-making gets delayed. Not to mention the business suffering due to this. Yeah, absolutely.
So, a few other tips that I would have in an organizational transformation. So, our company, three years ago, I think we would still classify it as a startup. And with startups, there’s investment money, and that money keeps the company afloat and allows it to scale, but would not necessarily profitable at that time. And there comes a certain point where you have to get profitable as a startup. And we were going through that during the pandemic. And there comes a certain point where you have to get profitable as a startup. And we were going through that transition. Maybe for some other enterprise level companies, it’s not that, but maybe you have challenging KPIs. And I think the best way to achieve those KPIs is to make data transparent, not just to the workers in your department, but across departments. So, what we did is we made certain key dashboards, performance indicators of the whole company, but also individual departments. We started sharing those with other departmental leaders and their teams. And it gave people this feeling that not only are they responsible for their KPIs, but they get a sense of recognition when they move the data in the right direction. And there was this sense of excitement. The first thing everyone does on Monday is they log into Magento BI, and they check on their data and their dashboards. And they get excited to see the full week of last week’s data. And if the dashboards are built in a really nice way, that’s a fun process. And it’s exciting to look at different charts and explore that data. From the data team perspective, we moved away from this code repository. We were using Git. That was something I introduced to our team. Before that, there was not even code sharing going on. So, we had a code repository, but it’s a little bit cumbersome. There’s a learning curve to using Git if someone’s not coming from a development background. And also just remembering to save your code to the code repository after you’ve delivered your results to the end customer. In other ways, maybe you would paste your query into the spreadsheet whenever you deliver that data to a customer. In any case, Magento BI is far superior because every chart and every query that we write in the SQL Report Builder is saved. And so, anyone with edit access can click here, click edit, and then open up and see the code underneath. So, for us, this was tremendously valuable. So, here’s an example of a high-value dashboard that people like to look at. We have a lot of cost savings from a project that we did. But you can see here, I’ve given view access to a lot of people. Janus here has edit access. That’s our CTO, and he’s also an excellent writer of SQL. So, I always give him edit access, and I don’t think he minds showing his name here. But you can see, I can scroll down a lot. A lot of people have view access, and this gets people excited about these projects. So, I highly recommend making dashboards visible instead of siloing your teams into only seeing the need-to-know kind of information. Could you describe what was your team headcount and composition so our audience can relate to the amount of work versus team strength? Yeah, in the past, we had four active team members. Since then, it’s actually grown smaller for various reasons. And so, at the moment, my protégé, who I trained using MBI, has moved into a sister company, so the same founders, but a new startup. And so, he’s managing it entirely over there. And I’m managing our enterprise entirely over here. Because we had four people building all these dashboards at one point in time and automating everything, there came a point where the workload decreased a lot because everything is automated. So, teams are now self-servicing their data, and there’s this single source of truth. So, a lot of the complexities and problems that we were facing three years ago sort of vanished as we relied on MBI more frequently. And I’m able to maintain all of that because I rarely encounter problems or questions anymore. And so, now I’m more focused on helping us with new data roles, like in our marketing aspects. Great to hear that, Shan. Being a data evangelist, I’m sure you must have acquired a unique role I’m sure you must have acquired a unique way of developing reports in MBI. Any specific technique or operating procedure you follow in your team? Okay. So, I want to introduce this idea about common table expressions. What is a common table expression? It’s a style of writing an SQL query that creates a temporary table and it saves that table in memory only while the query is running, and then you can write another query that queries your previous query, if that makes sense, and I’ll show you what I mean by that. So, here’s a very simple query, and I’m going to use this concept throughout the walkthroughs. So, we’ve got a table called item, and we’re getting a list of items when they were created and some other information. To make this a CTE, it’s very simple. I wrap it in parentheses. So, now I’m giving a name, CTE1, to this. I could change that to any other name. In fact, I would recommend a better name, like, I don’t know, items. And then I can write another query below that, and it’s querying off of CTE1. So, CTE1 is not a database table. It is the results of this output. Maybe if you’re familiar with databases, this sounds like a view. You could create a view in the database, but a view remains permanently, and it uses processing power from your SQL server. Whereas this, as soon as this query runs and the query is finished, the CTE1 disappears. It’s just a temporary spot in memory. So, we would get some results from this. And why would we do this style? I mean, this is actually maybe a bad example because there’s only one, but you can actually chain five, six, seven, eight of these CTEs in a row. Well, for one, I think it’s far more readable, and we’ll show some examples of that. Number two, you can spot check it. So, if you have five or six of these CTEs in a row, it’s very easy to comment out CTE three, four, and five, and then select star from CTE2 and see the results of that output. So, you can spot check through the CTEs. It makes it much easier to debug. You’ve got more reusability. It’s easier to copy and paste snippets. And I think it leads to faster problem solving. Maybe some people would say that, you know, if you were writing this for an application, in an application, you need speed and technical excellence, and that is number one. You want the most efficient query possible, and you’re talking about shaving off milliseconds, and you would consider that a big win. But in Magento BI, queries are so fast, it doesn’t matter if a query runs in one second or three seconds, or it’s really complex, so it takes ten seconds, then it’s cached, and after that, it runs in less than one second. So, it’s really, I think, more important, especially for teams, to have readable code that other people can come and see what you intended, and to easily debug that, or to copy and paste part of your query, so I can quickly solve another problem. So, that’s what MBI really empowers you to do, and I think MBI is the perfect tool for common table expressions.
So, now we can do a more practical walkthrough of an example. I want to start this by showing us the final result. What we’d really like to have is a line chart, going through time, so let’s say weekly data, and let’s make it simple. We’ll say a count of a particular type of item, we’ll say furniture items, and just to make it interesting, we’ll say you can’t include returns, so maybe there’s some returned items for whatever reason.
So, we would start by writing the CTE we just looked at, so we’re getting a list of items, the data would come out looking something like this, so you’ve got a list of item IDs, you’ve got your weeks, they’re not in order, you’ve got item type, we don’t know what these numbers mean yet, but they’re here, and we see some prices. I think it’s interesting, I wrote priceUSD is not null, and maybe this is an example of how someone could screw up in a visual report builder, you think you’ve got the right filter, but it just turns out your database also has some zero values. So, now if you take an average of price, you’re going to have zeros in there skewing your average, and if you’re not aware of those zeros, maybe you want to filter these out, maybe it’s a nuance of your database, I think all databases have nuances like this, so we would probably even want to filter out zero prices.
And that’s what this spot check is here, so you can see the dash dash, that means comment, so it’s not going to run, but I can easily delete that dash dash, I can spot check this, and that’s where I would catch those zeros, and then maybe I would update this query. So the CTE2, I’m saying from return item, so in this hypothetical example, this means we have a database table of items that were returned. I would say that’s the best way to engineer this, maybe a database engineer would put a column on the item table, have a zero or one if it’s returned, I don’t think that would be the right way to do it, this would be a nice engineer. So we’ve got a table return items, we get a list of all those return items in the same timeframe, and then we write our, and it might look something like this, so you’ve got the same IDs, and then you’ve got a date. So our third CTE, now we’re filtering on return item ID is null.
So in other words, since we did a left join, it means there’s going to be a null value here if the item was not returned, and since we have this where clause, we have now removed any return items. So, you know, basic SQL concept there. But it’s more readable in the CTE format. You can see what we’re doing, we can name these CTEs, personally, I wouldn’t write CTE3, I would just write returns, something like that. So now we move into our analysis section. And it’s pretty easy to see what we’re doing here. We want data that has the week, the item type, and then we’ll do our functions, so count and sum. The results might look like this. So we’ve got different item types, we’ve got the weeks now, we’ve got the counts, and our question asked for furniture items, so we need to go figure out what are these item type IDs, what do those map to. So in our final, I call this the naked query, so these are wrapped in parentheses in the final naked query, I think that is a technical term, I’ve seen it somewhere. So here is where I would add item type ID equals three, so I’ve, you know, in this example, I’ve done the research to figure out what those item types map to instead of interjoining some item type table and getting the name, I would just say equal three and leave a comment. So I like this style because now if someone comes to me and says, oh, I’d really like to see that same chart but for a different item type, I could copy and paste this entire thing and just change the ID here.
Oh, and so the output might look like this. So why not write this as one dense terse query? This is probably what most people are familiar with seeing SQL queries, and they can get much larger, they can have nested subqueries, and it’s really common to see it written this way, especially in application development.
For one, I think this is less readable, it’s more difficult to understand the author’s intention, why is price USD not null here? You know, it’s disconnected from the CTEs where it’s very illogical problem solving flow. I could probably figure out what this left join is doing, but this return item ID is null, it’s all the way down here, and now I’ve got these two nulls, so I feel a little bit confused about that. Item type ID, that’s pretty understandable, but I’ve got a group by on this, so the whole thing is complex, I guess technically this probably wouldn’t even run, it should be grouped by one and two. So, in any case, this is difficult to understand. If I want to debug it and figure out what’s wrong with it, it’s also going to be more challenging, I’m going to have to copy this left join and open a new window, run that over there, figure out those results, it’s just much more complicated. I don’t think I could reuse anything from this, because it’s so customized to solving that one particular problem, and it’s really not a logical flow. I don’t think anyone writes queries like this the first time. Usually they’re writing smaller pieces of this query, and then they stitch together this beautifully dense query, and it feels nice to do that sometimes, and it’s impressive, and someone else looks at it and they don’t understand any of it, and they say, wow, you’re really smart, because you wrote this difficult to understand query, but I don’t see it that way, I’d rather it be readable. So, let’s do another example, and this will be a little bit more in depth, I think more practical. So, the end result, what we’d like to have is here, a stacked bar chart. So, it’s going to be a count of some items per week, and we’re going to not include returns, just like last time, but we want to stack it, and let’s say there’s locations, so maybe there’s warehouses, or three offices, or if we adapted this example to, let’s say, a marketing team, maybe it’s three different websites, or three different web pages for customer conversions, it could be any example, but in this case, we can copy and paste everything we just did, the items, and we filtered out returns, and it’s really clear and easy to read that, especially if we’ve labeled our CTEs descriptively. So, copy and paste that, and jump right into the analysis, but there’s one more thing we need, we said we want facility location, so we’re just going to open up our CTE1, and add facility ID here, and we don’t need to change anything else, because you can see down here in CTE3, it says CTE1.star, meaning all of the columns from here, so this is really a nice, well-written query, sort of a, I call it a master query, so a master query is something that’s really easy to adapt to solve many problems. So, we write our fourth CTE, and this is the analysis part, and at first glance, I think an application developer would frown at this, it’s kind of cringey, when you see a wall of case win statements embedded in a function, but Magento BI is so fast, you can do this, it’s okay to not be technically excellent in Magento BI, because it’s going to run so fast, every column is indexed, you don’t need to worry about that. So, this is more of a master query style, because we can now answer a lot of questions with one query. So, the output of this, it might look like this, we’ve got our weeks, facility IDs, three different facilities, one, two, and three, and we’ve got our counts of various product types, or item types. So, our final naked query, to finish the result, the final one is always naked and never wrapped in parentheses. So, we’d say select all from the analysis, here I’m saying, don’t show me this week, I only want to see full weeks of data, so get today’s date, find out what week that is, and in the chart, only show weeks that are less than this current week. And so, then we get a nice chart like this. But, in this example, we’re using facility ID, and maybe not everyone in the organization knows what facility one, two, and three is. So, we would do a little extra modification here, enter join, we probably have a table called facility, or maybe it’s office, or warehouse, something like that, a reference table. We modify our query slightly, and then we get a chart like this, and in Magento BI, when you hover on one of these bars, you’ll get a pop-up showing your data. So, you can see in this case, I’ve made it say accessories count, we didn’t specify in the question, but accessories count, it would be so easy to adapt this to show something else, it could be furniture count, or handbags count. So, you’re doing this in the SQL report builder, it has a way of making charts super easy, and here we have the facility name instead of the ID, we have the count, and the week. Yeah, I like the way you organize the use case, Shan. Starting from an end objective, and systematically breaking down the ask in terms of separate CTE chunks. In addition to making the query run faster, CTEs would also help make it more readable and reusable within the team. Often there is a need to create the dataset based on which further analysis could be done, or visual reports could be built. Being temporary in nature, I think CTE output does not occupy physical space in data warehouse, thereby saving efforts in cleaning them up periodically. Yeah, I agree, and actually I want to expand on that in the next slide.
So, I want to show you how great this chart builder is for the SQL report builder. So, the visual report builder is a little bit separate and probably more intuitive to use, but for someone who is writing their SQL code, the results tab is always, you know, you could think of it, it looks like a spreadsheet, it’s the raw data. So, it’s so easy, you click on the chart tab, and then you have three drop downs. And I remember when I first started making these charts, I was a little bit confused. Do I click label, what should be the series, what do I put under category? And usually I would just click around with trial and error until I got it right. But I’m going to give away my secret for how to understand this three step process. Step one, start with the label. And the label is always the time series. So, it is the x-axis. And I think probably 95% of the reports that I build in RGMetrics are always weekly or monthly or daily data. And if someone asks for, oh, I just want this three week chunk of time and that’s all I need, well, I’m going to build that into a weekly report. I’m not going to waste my time just making a one time analysis of those three weeks because I feel sure that person is going to come back with more questions. So, I’ll just go ahead and automate it for them. So, step one, label. You click that drop down, you’ll see the options here. This is an example from a dashboard I actually have. So, I created a measurement for waste time. And so, here I’m going to click on month. And right now, the chart is still, it doesn’t make sense. It still needs more. It has to have a series. So, that’s step two. Step two is what we’re measuring. It’s the y-axis in most cases. Maybe if you change to one of these other report styles, it might be a different axis, but in general, it’s always the y-axis of what we’re measuring. So, I would open that drop down, I would select what I want, in this case, waste time. And then step three, this only applies to stack charts. So, there’s a few different types of stack charts in Magento BI. I never use category unless it’s a stack chart. So, in this case, I would choose offices. And now we have a beautiful visual. So, you were mentioning before about how this is powerful for enterprises. Well, if you’re an enterprise, and you’d like to have this data report, and you don’t have Magento BI, you need to ask a developer team to build a custom dashboard just for you, and add that to your employee-facing tools. Well, that’s really expensive. Not only is it expensive, but if your requirements change, and you want a slight modification to that, that’s going to be another expense. And then maybe after six months, this data report’s not really important anymore. This is actually a really good example. We fixed our waste time problem, and now no one’s looking at this report anymore, because it’s already been successfully handled. So, now that would be a sunk cost that we spent on a report that we don’t need anymore. I think that is why executives like to have a business intelligence team or a few data scientists on hand, so you’re not creating a developer swim lane to build endless reports and dashboards for employee-facing tools that are constantly changing and becoming outdated. Instead, you have Magento BI, and you can build a limited amount of reports using just one or two data experts. You know, in my experience at American Airlines, I would say that’s under the umbrella of big corp and, you know, like giant enterprise level, and dev swim lanes are slow. There’s so much bureaucracy, and there’s a lot of approvals that these things need to go through, and there’s huge backlogs. So, if you wanted this data report, and you didn’t have Magento BI, and you didn’t have a team of data experts or business intelligence, and you’re trying to go through a developer path, you might not get this report for six months, eight months. That would be, I think, typical at American Airlines enterprise level type of organization, but if you have Magento BI, you could have this in, I think, an hour, depending if there’s a backlog or your urgency. There’s many times I’ve been pulled aside on something urgent. Can we have this before 3 p.m., and it’s 11 a.m. my time? So, I’d say, absolutely. I can do that in about 30 minutes. Actually, do you want to stay on the call with me, and I’ll share my screen, and we’ll just do it together? Magento BI is that fast, and that kind of collaboration actually helps people explore the data even further than their original request. Personally, I’m not shy about writing SQL with other people watching, so I like to share my screen and have the stakeholder stay on a call with me, and so the hour meeting that they booked is not an hour spent describing the problem to me and discussing it. Instead, it’s an hour spent actually actively solving the problem. All right. Thank you, Deepak and Shan, Shan, for that great presentation. There’s lots of good info there. Remember that we have the second session of Shan’s presentation this coming Thursday at the same time of day, and the same link. Everyone registered for today’s session will get a reminder email for Thursday’s session in a couple of days. All right. Now, let’s move on to the Q&A portion on the topics that Shan covered today. Please submit any additional questions in the Q&A pod on the left side of your screen, and also please submit your feedback in the poll questions now showing on your screen. These are really helpful for us to improve our content in the future. If you have questions after the webinar, or if we don’t get to your question during today’s Q&A, please feel free to send an email to mbi-webinar at adobe.com. We’ll receive questions there for about the next week. You will get a follow-up email in the next day or two with a link to a recording of today’s webinar and to the resources library where other webinar recordings are already posted. It will also contain the email address that I just mentioned for follow-up questions. All right. So, let’s see. We’ve got some questions that have come in. Okay. I think this first one is – looks like most of them are for Shan. Got just a little bit of time here for questions. As a novice with SQL, what is the fastest way to build good SQL skills? The fastest way is definitely air programming. There’s no faster way to learn SQL than to work with an experienced SQL writer and learn from them. So, when I say pair programming, I don’t mean they need to be sitting side by side each other, but it means the novice and the expert should have an internet call. The novice should screen share their computer. So, the novice is actually writing the query, exploring the database, and the expert is backseat driving and helping the novice work through some problems. This can feel a little bit painful in the beginning, you know, telling the novice, oh, you missed a comma, not there, but over there, yeah, up two lines, line 27, okay, now break a line, you know, things like that. But this initial pain, it dissolves quickly after the first two or three days of pair programming. Everything starts to fall in place and you start solving problems together very efficiently. And also, maybe 10 to 15 minutes per day, the novice can watch the expert write some queries, but it’s really more important that the novice do most of the typing. So, it’s a lot like learning algebra or calculus. You can watch the teacher solve some problems and you might think you understand it, but you don’t really integrate those skills if you don’t personally write and solve those problems yourself. So, just two or three hours per day in this pair programming style, writing meaningful queries, this will quickly convert a novice into an intermediate SQL writer who’s totally self-sufficient after a week or two, sort of like a boot camp for SQL training. And from that point, the person can teach themselves more and they just need practice to solidify those skills and work their way towards mastery. So, the novice, they need to feel comfortable with joins and group buys in the CTE style, and after that, they can self-learn a lot of the rest with some practice. Okay, great. Some good tips. All right, next question says, writing simple queries is easy, but sometimes the query seems correct and even provides data, but the data is wrong because of something in the query. How can we know the data is correct? This might be one of the most important issues with writing SQL, getting data results that appear correct but are actually flawed, and this can easily happen with visual report builders too. So, this issue is a good reason to always have at least one SQL expert in the organization or team and have code reviews or report reviews with that expert before using that data in decision making. So, when looking for those possible errors, the most common error is probably using an inner join when a left join should have been used instead, and the second most common error is probably in the where clauses, like forgetting to add a filter on something like a status ID or a type ID or removing null values or having duplicate rows. So, always look out for those, and if the query is written as a CTE style, it’s easy to spot check those CTEs and look for those errors, and it’s also easier to avoid those errors when writing the CTE style. Okay, great. I think we’ve got time for just one more question here. This is also for Shan, so it’s a little longer. The data team receives, I guess their data team receives many requests for data, but the requests are often in different formats, sometimes a long complex paragraph, sometimes a short ambiguous description, which leads to a lot of back and forth communication before creating a data report, or it sometimes leads to a data report that does not fit the expectation of the stakeholder. Do you have any recommendations for improving this process? Yeah, we’ve definitely experienced those issues also. It’s important to give stakeholders some expectations about what you expect in a data request. Personally, I like to have two things in my data requests. I’ve communicated this preference to all departments. The first part is that I want to have some context about why the stakeholder wants this data. Give me some context about the problem they are trying to solve or how the data will be used, and this helps me have a holistic understanding, and I can often include new insights the stakeholder didn’t know existed, and this usually makes stakeholders happy and feel like the data team is truly supportive. The second part is not required, but if the stakeholder is wanting a spreadsheet of raw data, maybe to do their own analysis separately, I like to see a spreadsheet example of the columns they expect to receive, mostly because I think this helps stakeholders think about the problem and clarify their own thinking before submitting the request. Okay, so if you feel you can answer this question that just came in sort of quickly, we have just a couple minutes left. I want to make sure we respect folks’ time and don’t run over. It says, how did you manage 9,000-plus reports and 500-plus dashboards? How many users did you have overall? Did you use a separate tool or manage directly from your MBI account user by user? So maybe you can cover at least part of that. Yeah, good question. We have over 100 users, and we only use the MBI administrative tools that come out of the box with Magento. It’s as simple as adding a company email address. They’ll receive the email, and then they can log in. I think it could be any email address, so it doesn’t necessarily have to be within your company. Maybe if you have some contractors working outside of the company, they can also join, just add their email address. They get an email, they log in, create a password. So one trick we do use for managing those users is we’ll put their department name, like marketing or customer support, we’ll put that department name in parentheses beside their username. This helps when sharing dashboards, and you have over 100 different people you could possibly share to. This helps make sure you share to the right person. Some people have the same or similar names in different departments. Most people create their own personal dashboards, like a playground or a sandbox, and it’s private. They’re not sharing that with many people. And it’s great for exploring data and creating reports and sitting on those for a while while you vet the data. We generally don’t consider a report verified for decision making unless it has been created or reviewed by the business intelligence team. And at one point, we decided we have too many unused or outdated dashboards and reports. It’s really easy to have a lot of fun in Magento and create all sorts of things that maybe seem interesting at the time or are relevant for a few years ago but are not relevant anymore. So we decided we are going to have a formal cleanup process. We sent a support ticket to Magento. They have a great support team, very responsive. And we asked about our unused dashboards or if they could give us any insights to help us in this cleanup process. So they gave us a spreadsheet list of dashboards that did not have views in the last 30 days and in the last 90 days. They were able to give us the emails of the users who created those dashboards. So then we could reach out and we could ask those people if those dashboards are still necessary or could they be deleted or could some reports be removed from those. And we removed a few dozen dashboards that way. And then we worked with department leaders to review the dashboards that we want to keep but delete some unnecessary reports from those. And that just helps people find data that they’re looking for more easily when they’re not distracted with outdated reports. All right, great. Thank you so much, Shan. And thanks to everybody who attended. We are out of time for this webinar. Hope you got some really great information today and we hope to see you again on Thursday. Have a great rest of your day.