---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# fabricQueryR
<!-- badges: start -->
[![R-CMD-check](https://github.com/kennispunttwente/fabricQueryR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/kennispunttwente/fabricQueryR/actions/workflows/R-CMD-check.yaml)
[![CRAN status](https://www.r-pkg.org/badges/version/fabricQueryR)](https://CRAN.R-project.org/package=fabricQueryR)
<!-- badges: end -->
'fabricQueryR' is an R package for querying data from Microsoft Fabric.
It provides four methods to get your Microsoft Fabric data into R:
1. Create a connection to a SQL endpoint (e.g., from a `Lakehouse` or `Data Warehouse` item): `fabric_sql_connect()`.
This returns a 'DBI' connection object, which you can use to execute SQL queries and/or
with 'DBI'-compatible packages like 'dbplyr' (see the sketch after this list).
2. Execute a DAX query against a Fabric/Power BI `Semantic Model` item: `fabric_pbi_dax_query()`.
With this, you can run DAX queries against a Fabric/Power BI dataset and get the results as a 'tibble' dataframe.
3. Read a Delta table from a Fabric `Lakehouse` item: `fabric_onelake_read_delta_table()`.
This function downloads the underlying Parquet files from the Delta table stored in OneLake (ADLS
Gen2) and returns the data as a 'tibble' dataframe.
4. Execute a Livy API query: `fabric_livy_query()`.
With this, you can remotely execute Spark/Spark SQL/SparkR/PySpark code in Microsoft Fabric
and get a list with the results in your local R session.
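As a quick illustration of point 1, here is a minimal 'dbplyr'-style sketch. It assumes the 'dplyr' and 'dbplyr' packages are installed; the server address and the `Customers` table are placeholders mirroring the usage example further below:

```{r, eval=FALSE}
library(dplyr)

con <- fabric_sql_connect(
  server = "<your-endpoint>.datawarehouse.fabric.microsoft.com"
)

# dbplyr builds the query lazily and translates it to T-SQL on the endpoint;
# only collect() pulls the result into a local tibble
tbl(con, "Customers") |>
  count() |>
  collect()

DBI::dbDisconnect(con)
```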
## Installation
You can install the development version of 'fabricQueryR' like so:
```{r, eval=FALSE}
if (!requireNamespace("remotes", quietly = TRUE)) {
install.packages("remotes")
}
remotes::install_github("kennispunttwente/fabricQueryR")
```
Or, install the latest release from CRAN:
```{r, eval=FALSE}
install.packages("fabricQueryR")
```
## Usage
See the [reference](https://kennispunttwente.github.io/fabricQueryR/reference/index.html)
for the full documentation of all functions.
Below is a code snippet showing how to use the four methods to get data from Fabric into R:
```{r, eval=FALSE}
# First find your 'tenant' ID & 'client' ID (app registration) in Azure/Entra
# You may be able to use the default Azure CLI app id;
# this will be automatically used if you do not set 'FABRICQUERYR_CLIENT_ID'
# The AzureAuth package is used to acquire tokens; you may be redirected
# to a browser window to sign in the first time
library(fabricQueryR)
# Sys.setenv(FABRICQUERYR_TENANT_ID = "...")
# Sys.setenv(FABRICQUERYR_CLIENT_ID = "...")
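# Tip (general R practice, not specific to 'fabricQueryR'): you can also set
# these in your user-level .Renviron file, so they are available in every
# session without hard-coding them in scripts, e.g.:
#   FABRICQUERYR_TENANT_ID=<your tenant ID>
#   FABRICQUERYR_CLIENT_ID=<your client ID>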
# SQL connection to Data Warehouse or Lakehouse --------------------------------
# Find your SQL connection string in Fabric by going to a Lakehouse or Data
# Warehouse item, then go to 'Settings' -> 'SQL analytics endpoint'
# Ensure that the account/principal you authenticate with has access to
# the workspace
# Get connection
con <- fabric_sql_connect(
  server = "2gxz...4qiy.datawarehouse.fabric.microsoft.com"
)
# List databases
DBI::dbGetQuery(con, "SELECT name FROM sys.databases")
# List tables in the current database
DBI::dbGetQuery(
  con,
  "
  SELECT TABLE_SCHEMA, TABLE_NAME
  FROM INFORMATION_SCHEMA.TABLES
  WHERE TABLE_TYPE = 'BASE TABLE'
  "
)
# Read 'Customers' table
df_sql <- DBI::dbReadTable(con, "Customers")
# Close connection
DBI::dbDisconnect(con)
# Table from Lakehouse via OneLake data access ---------------------------------
# Ensure that the account/principal you authenticate with has access via
# being part of the workspace, or via Lakehouse -> Manage OneLake data access
df_onelake <- fabric_onelake_read_delta_table(
  table_path = "Customers",
  workspace_name = "ExampleWorkspace",
  lakehouse_name = "Lakehouse.Lakehouse"
)
# DAX query against Semantic Model ---------------------------------------------
# Find your connection string in Fabric by going to a 'Semantic Model' item,
# then go to 'File' -> 'Settings' -> 'Server settings'
# Ensure that the account you use to authenticate has access to the workspace,
# or that you have been granted 'Build' permissions on the dataset (via share)
df_dax <- fabric_pbi_dax_query(
  connstr = paste0(
    "Data Source=powerbi://api.powerbi.com/v1.0/myorg/",
    "ExampleWorkspace;Initial Catalog=test data 1;"
  ),
  dax = "EVALUATE TOPN(100000, 'Sheet1')"
)
# Livy API query to execute Spark code -----------------------------------------
# Find your connection string in Fabric by going to a 'Lakehouse' item,
# then go to 'Settings' -> 'Livy Endpoint' -> 'Session job connection string'
# Ensure that the account you use to authenticate has access to the workspace
# Run a Livy SparkR query
livy_sparkr_result <- fabric_livy_query(
  livy_url = "https://api.fabric.microsoft.com/v1/workspaces/.../lakehouses/.../livyapi/...",
  kind = "sparkr",
  code = "print(1+2)"
)
# (See example in `?fabric_livy_query` for how to get data from the Lakehouse to R with Spark)
```
## Background
Microsoft Fabric is a new data platform from Microsoft which combines various data services,
including data warehousing, data lakes, and business intelligence.
It is built on top of Azure Data Services and integrates with Power BI for analytics and reporting.
Microsoft is actively promoting Fabric as the next-generation data platform for organizations using Microsoft Azure and Power BI.
As our organization started working with Microsoft Fabric, I found that loading data from Fabric into R was not
yet straightforward, and took some effort to get working. To help others in the same situation,
I decided to share the functions I created to make this easier.