Hi, this is Sofie from ThoughtSpot, and I’m just going to go through a quick demonstration of the Data Connect capabilities that are new in 3.3.
So I’m going to go ahead and log in here as an admin user. I’m going to flip to my data tab over here. And you can see actually that I have pulled in a number of tables already. This is coming from an external SQL server instance that I have loaded the standard retail demo data and schema into. But we’re going to walk through this process one more time just to show you how it’s done.
So I’m going to go in and add a New Data Source. As I’ve said, I’ve already created a connection to a SQL Server instance that’s currently running in AWS in a Windows environment, and it has the full retail schema. I’ve also created a connection to a sandbox instance of our Salesforce production account. So this has our full schema, but it does not have any of our internal data.
So we’re going to select SQL Server and click Next. You’ll see here that if I click this dropdown, the existing connections appear. You could also create a new connection, though for the purposes of this demo we won’t be doing that.
So I’m going to select the SQL Server Production connection, and now I can go through and select each of the tables that I want to bring over in my job. You can see that if I select one of these tables, I can deselect some columns if I so desire, so I don’t have to bring in the full table.
Now we are brought to this Transformation stage, where I can add filters to my data, or expressions that add new columns or modify existing ones. I’ll show you how this is done. If I add a filter to, say, the Store_Dimension table, I can filter by a particular column. So we’re going to filter on Store_Region. And you’ll see here that there is a list of operators I can use for my filter. Here I’m going to select EQUALS, and we’re going to filter on store regions that equal west, so we will only retrieve rows where the store region matches this value.
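To make the effect of that filter concrete, here is a minimal sketch of the same logic in Python. The sample rows are made up for illustration, and this is not ThoughtSpot's or Informatica's actual expression syntax; it just shows what the EQUALS filter does to the table.

```python
# Hypothetical sample of the Store_Dimension table (illustration only).
store_dimension = [
    {"Store_Key": 1, "Store_Region": "west"},
    {"Store_Key": 2, "Store_Region": "east"},
    {"Store_Key": 3, "Store_Region": "west"},
]

# The EQUALS filter keeps only rows whose Store_Region matches "west".
filtered = [row for row in store_dimension if row["Store_Region"] == "west"]

print(len(filtered))  # 2
```

Only the matching rows are retrieved by the job; everything else is dropped before loading.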
Now if I wanted to add an expression, I can again select the table that I want to add the expression to, and I can also select an existing column. So if I wanted to add an entirely new field to my table, I can add a new column. Alternatively, I can modify one of my existing columns, Customer_Key for example; I can concatenate a customer ID prefix in front of the actual customer number for more clarity. In this case though, we’re going to add a new column and name it percent_gross_margin. So I’ll write out the formula here: take the Sales_Dollar_Amount column, subtract the cost, divide that by the cost, and multiply it by 100. And this will give us the percent gross margin. So I will add that.
I’m also going to show an example of a different type of function that you can use, across the Customer_Dimension table and again that Customer_Key. We’re going to modify this and, as I mentioned earlier, concatenate a customer ID prefix in front of the actual customer key, so it’s a little clearer what the values here are. And then we’ll add that column.
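The two expressions above can be sketched like this in Python. The sample values, the column name Cost, and the exact "customer_ID" prefix format are assumptions for illustration; the arithmetic follows the formula described in the demo.

```python
# Hypothetical sample row from the sales fact table (illustration only).
sales_row = {"Sales_Dollar_Amount": 150.0, "Cost": 100.0}

# New column: percent_gross_margin = (sales - cost) / cost * 100.
percent_gross_margin = (
    (sales_row["Sales_Dollar_Amount"] - sales_row["Cost"])
    / sales_row["Cost"]
    * 100
)
print(percent_gross_margin)  # 50.0

# Modified column: concatenate a customer ID prefix onto Customer_Key
# (the exact prefix/separator shown is an assumption).
customer_key = 4021
modified_key = f"customer_ID_{customer_key}"
print(modified_key)
```

The first expression adds a brand-new column to every row; the second rewrites an existing column in place, which is why the loaded CUSTOMER_KEY values later show the prefix.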
Now if I click Next, these transformations will be applied to the data when my job runs. You do need to create a unique name for each source that you bring in, so here we’re going to call this our full retail schema and say Done. Note that this name must be unique across all of your jobs; otherwise you will get an error.
Now, I’m not going to run this on a schedule, but if I wanted to, I could select a date and time for this job to start. I could also have it repeat on a certain interval: every minute, for instance, on certain days, during a time range, repeating indefinitely or ending after a period of time. I could also schedule it on an hourly, daily, weekly basis, where I decide the particular day or days of the week, or monthly. Again, for this demo we’re not going to have it repeat. And note that if this job is loading to an existing table, you can truncate the data, or you can just leave this blank to upsert it instead.
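The difference between those two load modes can be sketched as follows. This is a simplified illustration with made-up rows keyed by Customer_Key, not ThoughtSpot's actual load implementation.

```python
# Hypothetical existing table rows, keyed by Customer_Key (illustration only).
existing = {
    1: {"Customer_Key": 1, "Name": "old value"},
    2: {"Customer_Key": 2, "Name": "untouched"},
}
incoming = [
    {"Customer_Key": 1, "Name": "new value"},
    {"Customer_Key": 3, "Name": "brand new"},
]

# Truncate-and-load: discard all existing rows, keep only the incoming batch.
truncated = {row["Customer_Key"]: row for row in incoming}

# Upsert: update rows whose key already exists, insert the rest,
# leaving unmatched existing rows in place.
upserted = dict(existing)
for row in incoming:
    upserted[row["Customer_Key"]] = row

print(sorted(truncated))  # [1, 3]
print(sorted(upserted))   # [1, 2, 3]
```

With truncate, the reload fully replaces the table; with upsert, prior rows that the new batch does not mention survive.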
To view the full schema, you can click here, and you’ll actually be able to make changes to it as well. But for the time being, we’re just going to load this as is. It takes a little while to kick off the job, but soon we’ll be brought back to our Data Sources tab.
So here you can see the job that we just created, and we have a load status. I want to note that this status is updated from Informatica Cloud, so it may not always be current. But you can look at an existing job here, and it’ll show a success status for tables that have been loaded and an in-progress status for those that have not. When you do get that success status, you can also see how many rows were actually loaded into the data set.
One of the best ways to really tell whether the data has been loaded and the job has been successful is to go back to your table view and look at the tables themselves. So let’s take a look at the Customer_Dimension table and flip to our data tab. We see here that the information has indeed been loaded, and we even see that this CUSTOMER_KEY field has been modified appropriately, with customer_ID concatenated in front of it.
And that’s about all I wanted to cover for now about the Data Connect demo, so thanks for listening.