Creating an Analytics and Dashboard Service Over Breakfast
-
Posted by Harry E Fowler
- Last updated 6/27/19
- Share
Microsoft’s cloud-based Power BI service makes it easy to move data into the cloud and provide dashboards, visualizations and ad-hoc query capabilities to others in the same organization. Power BI is power to the people to ask questions and get answers.
Motivation
Recently, I’ve taken to analyzing sales data using a combination of Power BI and SQL Server. I chose Power BI because I’ve seen demos of it at the Professional Association of SQL Server (PASS) events and wanted a chance to experience the tool for myself.
I had tried Power BI earlier using Excel spreadsheets as a data source. Then I took another run at it with a SQL database on the back end. I’m good with SQL and love the combination of doing the heavy lifting through views in SQL, and then the visualizations and lighter-weight queries from Power BI.
Getting Data into SQL Server
Data comes to me in the form of Excel spreadsheets from my company’s I. department. I can’t show our corporate data in this article, so any data you see in the examples to follow is instead drawn from the City of New York restaurant inspection data.
First, I remove the clutter from each Excel file that I’m given. There are header rows and floating buttons and other visual elements that I select and delete. I also delete unwanted columns from the left edge of the spreadsheet.
I save what remains as a comma-separated values (CSV) file, and load from there into SQL Server using SQL Server Management Studio’s Import Data wizard. Figure 1 shows some of the restaurant inspection data as seen through that wizard during an early step in the load process.
I keep a one-row CSV file on the side that has my target column names in it. I paste that row at the top of each CSV file that I load into the database. Then I just tell the wizard that the first row contains the column names. This little step makes my job easier.
Making the Data Usable
Loading from a spreadsheet into a table leaves me with essentially a spreadsheet stuck inside the database. That’s not too helpful, so I next run a set of scripts to move the data into a collection of somewhat more normalized tables that are useful for querying against.
Figure 2 shows the tables that I create and populate from the inspection data. Each restaurant location gets a row in the restaurants table. Each inspection of a restaurant generates multiple finding rows in the inspections table that are stamped with the date the inspection took place. The violations table translates finding codes into human-readable descriptions.
Also on view in Figure 2 is a reporting view named findings. Creating such views allows me to put a friendlier face on the data, to make it more understandable and accessible to my colleagues. I’m also able to pre-answer questions that might be difficult for people unaccustomed to SQL to answer.
For example, I make Top N queries easier by providing a view that gives each of our books a sales ranking. My colleagues can see our top 10 best-selling books by specifying rank <= 10, our top 20 best-selling by asking for rank <= 20, and so forth. Some views provide specific answers to specific questions, whereas other views provide easier solutions to certain classes of question.
Loading to Power BI
Power BI’s default behavior is to pull the data you care to explore into a Power BI workspace. Figure 3 shows a refresh operation in progress, in which the restaurant inspection data is being reloaded to ensure that I have the latest copy of it. You can see the reporting view findings expanded in the right-most pane, showing the available columns in that view.
You do have the option of leaving the data in the database. At first, I chose that option, as it seemed obviously to be the better choice. Later, I warmed up to the idea of loading the data into my Power BI workspace. You’ll see shortly why I changed my thinking on that point.
Analyzing and Visualizing
Now I can explore the data to my heart’s content. Figure 4 shows the New York City Fly Cycle, which is a fun little pattern I found by plotting violations by type and by quarter. The chart shows that inspection violations involving flies peak each year in Quarter 3. In fact, fly violations for 2015 and 2016 peak both years in July, and then slowly descend through September.
Rats, by the way, are consistently the least cause of violations. Rats are the yellow line at the bottom in Figure 4. Mice are the larger problem, as seen by the orange line at the top.
Create good reporting views, and a lot of the work on the Power BI side becomes drag-and-drop. It’s easy to create charts, for example, tallying violations by borough, by cuisine, by type of violation. I can explore whatever questions come to mind as I review the data set.
Sharing the Love
Ease of sharing is where Microsoft has hit a mark with Power BI. I had gone into my sales analysis project thinking I’d have to sort out: how to create a SQL Server database on Azure; how to link that with my Power BI worksheet: how to push my data from a local database into my cloud database, etc.
I couldn’t have been more wrong.
All it took was for me to create an account on PowerBI.com, and then a shared workspace. I hit the “Publish” button that you see to the right side of the toolbar in Figure 5. Job done!
I literally, over breakfast one morning, pushed my workspace into the cloud and provided access to my colleagues. I began with a workspace like the one in Figure 4. Then in less than one cup of coffee’s time, I provided my colleagues access to explore and query the data for themselves.
That’s powerful!
Lessons Learned and Future Plans
Power BI is all about self-service business intelligence. Projects can be started at the grassroots level by anyone with vision and energy. However, IT departments can eventually exert control. Microsoft has built the service such that the domain owner can exert control, and shared workspaces are restricted to email addresses within only one domain.
One can be tempted to release too early; I was. My initial enthusiasm had me immediately wanting to share the project. Fortunately, I was distracted by a business trip for long enough to realize that I really ought to polish my reporting views and work out bugs in them before getting too many others on board. Sharing is good. However, sharing too early means you might share something that’s buggy.
Managing access by multiple users presents problems I’ve not yet fully solved. How do I push to a group workspace and ensure that my data remains read-only? How do I provide predefined reports to the group workspace for others to rely upon as being correct and accurate while protecting those same reports from modification? Answering these and other questions on managing Power BI across a department or a company is my next focus area as I explore what the product can offer.