Connecting Power BI with Python & R
Power BI supports Python and R for advanced data transformation, statistical analysis, and custom visualizations.
Python in Power BI
Python for Data Transformation (Power Query)
Use Python scripts in Power Query to clean and transform data.
Setup:
- Install Python (3.7+) and required libraries (pandas, numpy)
- In Power BI Desktop → File → Options → Python scripting
- Set the Python home directory path
Using Python in Power Query:
- Home → Transform data (opens Power Query)
- Select a table → Transform → Run Python Script
- Write your Python code:
# The input data is available as a pandas DataFrame called 'dataset'
import pandas as pd
# Example: Remove outliers
Q1 = dataset['Amount'].quantile(0.25)
Q3 = dataset['Amount'].quantile(0.75)
IQR = Q3 - Q1
dataset = dataset[(dataset['Amount'] >= Q1 - 1.5*IQR) & (dataset['Amount'] <= Q3 + 1.5*IQR)]
- Click OK → Select the output DataFrame → Apply
Python Visuals
Create custom visualizations using Python's matplotlib, seaborn, or plotly libraries.
Steps:
- In the Visualizations pane → Click the Python visual icon (Py)
- Add fields to the visual's Values well
- A Python script editor appears at the bottom
- Write your visualization code:
import matplotlib.pyplot as plt
import seaborn as sns
# 'dataset' contains the fields you added
plt.figure(figsize=(10, 6))
sns.boxplot(x=dataset['Category'], y=dataset['Amount'])
plt.title('Sales Distribution by Category')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
- Click Run → The chart renders in Power BI
Python Libraries Commonly Used
| Library | Purpose |
|---|---|
| pandas | Data manipulation and analysis |
| numpy | Numerical computing |
| matplotlib | Basic plotting and charts |
| seaborn | Statistical visualization |
| scikit-learn | Machine learning models |
| scipy | Statistical testing |
| statsmodels | Statistical models and tests |
---
R in Power BI
R for Data Transformation
Similar to Python, R scripts can be used in Power Query:
- Transform → Run R Script
- Input data is available as a data frame called
dataset - Write R transformation code:
# Example: Add calculated column
dataset$ProfitMargin <- (dataset$Revenue - dataset$Cost) / dataset$Revenue * 100
R Visuals
- Click the R visual icon in Visualizations pane
- Add fields → Write R plotting code:
library(ggplot2)
ggplot(dataset, aes(x=Category, y=Amount, fill=Category)) +
geom_bar(stat="summary", fun="mean") +
theme_minimal() +
labs(title="Average Sales by Category")
Comparison: Python vs R in Power BI
| Feature | Python | R |
|---|---|---|
| Data Transformation | ✅ Power Query | ✅ Power Query |
| Custom Visuals | ✅ Python visual | ✅ R visual |
| Machine Learning | ✅ scikit-learn, TensorFlow | ✅ caret, randomForest |
| Statistical Analysis | ✅ scipy, statsmodels | ✅ Built-in (native strength) |
| Community/Libraries | Larger ecosystem | Strong stats community |
| Learning Curve | Easier for beginners | Steeper for non-statisticians |
| Performance in PBI | Good | Good |
Limitations
• Python/R visuals are static images — no interactivity (hover, click) • Scripts run locally in Desktop and in a sandbox in Service • Power BI Service requires a Premium capacity for Python/R visuals • Script timeout: 5 minutes in Power BI Service • Cannot access the internet from scripts in Service (sandbox restriction)
Best Practices
• Use Python/R for analysis that DAX cannot handle (ML, advanced stats) • Keep scripts simple and fast — avoid heavy processing • Use pandas/tidyverse for data wrangling, not raw loops • Test scripts outside Power BI first (in Jupyter/RStudio) • Document your scripts with comments for team members • Consider AI visuals in Power BI before writing custom scripts