Integrating Python with Power BI

Microsoft has recently added Python support to Power BI Desktop. This is available as a preview feature for Power BI Desktop, yet a nice feature for those who use Python on a day to day basis. Python is an open source general purpose language and has become one of top programming languages for data science. Power BI is Microsoft’s flagship BI tool for creating dashboards and interactive reports.

Similar to R, with Python as a backend engine for Power BI Desktop, you can do many cool things, which can be categorized into three broad applications–

  • using Python to import data into Power BI,
  • using Python to query existing data in Power BI, and
  • creating visuals with Python script

In a previous article, I’ve demonstrated how to use R with Power BI. In this article, I demonstrate how Python can be integrated with Power BI to perform these applications.

Application 1: Using Python to import data sets into Power BI

Power BI can import data from a variety of sources. For a complete list, please see here.

Importing Excel .xlsx or .csv files into Power BI is perhaps the most common usage. However, I’ve personally found it to be problematic. In particular, if your source data is continuously updating including addition or removal of columns, then the changes do not get updated when you simply refresh the data sources from Power BI Desktop. This is a problematic when you are developing your interactive report and also when the dashboard is in production.

A robust solution of importing data is to use Python (or R) script. If you WORKING here

Preparing the Python script

This is the stage you would write your Python script to load your data. You can use any IDE for Python. I use Jupyter Lab for that. The goal is to test your code before using in Power BI.

I am going to create a smiple data set with two columns using the Python codes below. This data is being created for demonstration purposes only so that you can reproduce it when you are learning.

# Sample dataframe object in Python
# You need to load Numpy and Pandas libraries
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(50, 2), columns=list('xy'))
##           x         y
## 0  0.883893  0.195865
## 1  0.357537 -2.343262
## 2 -1.084833  0.559696
## 3  0.939469 -0.978481
## 4  0.503097  0.406414

This code creates a Pandas DataFrame and saves in an object df. This will be the name of the data when you import into Power BI. You can give the data object any valid Python variable name you want.

The above code also shows df.head(), to demostrate that the data indeed have been generated. Power BI only imports the DataFrame obejct. In this case, it will only import the df object and ignore the rest. So you don’t have to worry about other codes in the script. Just make sure all codes run without error and without any dependencies.

Enable Python script in Power BI

Since Microsoft has recently release the Python option in PowerBI, it is a feature that you have to enable from under Options. Go to Files > Options and Settings > Options > Preview features. Check the box for Python support. See the image below.

Fig. Enable Python Preview Feature in Power BI

Fig. Enable Python Preview Feature in Power BI

Once you enable the Python feature, you may have to close Power BI and reopen to see the Python scripting option. Once you restart Power BI, you will go to Python scripting option and then select the path to the location where Python is installed in your PC. For me, I use Anaconda software to manage Python packages. So it was under C:3. Also, it is optional to set the default IDE for Python. It was set to Visual Studio Code. It doesn’t really matter which IDE you use.

Fig. Set Python path in Power BI

Fig. Set Python path in Power BI

Run your Python script and import data

In Power BI Desktop, the Python Script data connector is found in Get Data menu. Click on Get Data > More…, then select Other > Python script as shown in the following image:

Fig. Get Data using Python script

Fig. Get Data using Python script

Fig. Python script interface in Power BI

Fig. Python script interface in Power BI

Notice that at the bottom of the above image, the your Python installation path is displayed. If you have multiple installation of Python, then you would need to go to Options and Settings > Options > Python scripting and then setting the location of the Python installation you would like to use.

Click OK, and it should open the Navigator screen as shown below.

Fig. Python script interface in Power BI

Fig. Python script interface in Power BI

Loading the data

Once you are at this stage, you select the data frame created with the Python script (in this case df), and then click Load (or Edit if you want). This should load the data in Power BI. You can verify that in the Fields pane that the df has indeeded loaded as shown in the folowing:

Fig. Python script interface in Power BI

Fig. Python script interface in Power BI

Now that you’ve successfully imported the data using Python script, you can use the variables the typical way you would work within Power BI Desktop. If you want to add or edit any new column/variables or anything with the data using the Python script, you would need to update the Python script and load the data as usual.

Updating the Python script

It is obvious that you would be updating the Python script since that was the whole purpose of importing the data via Python script. You continue to edit the script in your favorite IDE and when you are ready to update the code within Power BI, you need to go to Edit Queries option.

Simply right-click on the data under the FIELDS pane, and select Edit query. This should open the Power Query Editor and you can click on the gear icon ([1] in figure) as shown below:

Fig. Edit Python script

Fig. Edit Python script

Clicking that gear icon should open the R script editor where you can replace the existing code with the updated code. When you are ready, click OK, and then click Close & Apply ([2] in the figure) in the Power Query Editor to apply the changes.

Application 2: Using Python to query existing data in Power BI

This application is useful for manipulating an already imported dataset. That is, data have been imported into Power BI Desktop, and now you want to do some cleaning or perform transformations. Essentially, you could do all of those while importing the data (Application 1).

I am giving an example to demonstrate how it works.

Example to demonstrate re-coding

Suppose we’ve imported a data set with missing vlaues and unusually large numbers in some of the fields. We want to perform some cleaning. Of course this is a fictitious example just to demonstrate the idea.

We use the Python code to generate the data

data = {
    "name" : pd.Series(["John", "Arif", "Samira"]),
    "sex" : pd.Series([1, 1, 2]),
    "gpa" : pd.Series([4, 3.9, 3.95])
df = pd.DataFrame(data)
##     gpa    name  sex
## 0  4.00    John    1
## 1  3.90    Arif    1
## 2  3.95  Samira    2

We will import this data using the Python (discussed in Application 1). Once loaded, we will then do some transformations. To keep it simple, we will just recode the sex variable to indicate Male for 1 and Female for 2. For that, we use the following codes which we would enter in the R script:

# 'dataset' holds the input data for this script
dataset["sex"] = np.where(datasetdf["sex"] == 1, "Male", "Female")

Entering Python script to transform data

  • Click on the Edit Queries button in Power BI Desktop to open the query editor.
  • Select the appropriate query under the Queries[] menu on the left of the screen.
  • Click on the Transform menu above the ribbon.
  • You will see the Run Python Script button with the Py icon. Click that.
Fig. Python script Window

Fig. Python script Window

Enter the Python code that will be used to do the transformation. The screen would look like the following:

Fig. Python script Window

Fig. Python script Window

Click OK.

The screen would transform to the onces that looks like the following.

Now you click on the Home menu and hit the icon that says Close & Apply. Now the sex variable in the table should be a character variable with values Male and Female instead of 1s and 2s.

And you are done!

Application 3: Creating visuals with Python script

One of the best applications of Python with Power BI is that you can harness the power of two popular librries– matplotlib and seaborn, to create high quality graphs and display them within your report/dashboard in Power BI. All the filters/slicers will work as usual and the graphs will the dynamically updated when you slice your data.

As of posting of this article, you can only create Python visuals using Python script. I’m certain in the near future some caned visuals will be available from Microsoft AppSource.

In the following, I’ve demonstrated how to use your own code to create visuals with Python and display them in Power BI.

Example with tips data

This assumes that we already have a dataset in Power BI. For the sake of reproducibility of this example, I will create the data. This time, I will just load the tips dataset from the seaborn library.

Simply follow the steps in Application 1, using the code entered in the Python script window:

import seaborn as sns
tips = sns.load_dataset("tips")
##    total_bill   tip     sex smoker  day    time  size
## 0       16.99  1.01  Female     No  Sun  Dinner     2
## 1       10.34  1.66    Male     No  Sun  Dinner     3
## 2       21.01  3.50    Male     No  Sun  Dinner     3
## 3       23.68  3.31    Male     No  Sun  Dinner     2
## 4       24.59  3.61  Female     No  Sun  Dinner     4

This loads the data in Power BI as shown in the figure below.

Fig. tips Data from seaborn library

Fig. tips Data from seaborn library

Now, we would like to create a boxplot of total_bill by the day of the week. The code is given below.

To creat a Python visual, click on the Py icon from VISUALIZATIONS pane as seen in the above image (the icon is pointed with a red arrow).This should create a placeholder for a visual in the canvas and an Python script editor will show up at the bottom of the screen.

Select the variables you want to use for the visual. In our case, we would need total_bill and day. Once you select them, the Python script editor will update with some basic codes (uneditable).

Fig. Python visual script editor

Fig. Python visual script editor

Now you are ready to enter the following code in the editor. Enter the code below:

tips = dataset
import seaborn as sns
import matplotlib.pyplot as plt
ax = sns.boxplot(x="day", y="total_bill", data=tips)

Notice that, you have to name the data as dataset no matter what your data’s name actually is. Since you would normally write the code outside of Power BI for testing/debugging purposes, we would create a new variable tips' and assign thedataset` to it.

Then click on the little play button near the top of the script-editor to display the plot on canvas.

Fig. Python visual editor with code

Fig. Python visual editor with code

This will create the visual as expected. Now play with the code and make enhancements as desired.

Hope this tutorial was useful for you. If you have any question or have some tips, please leave them as a comment.

I appreciate if you share this via social media. Thank you for reading.

comments powered by Disqus