How to connect PowerBI to a Tabular API (and why you shouldn't)
Before we start, I'll be basing this on the Clickup API (their API documentation is actually pretty good; you should give it a look!).
Clickup is a project management tool, similar to Monday.com and Azure DevOps.
Nothing in this document is all that specific to Clickup, so the contents should mostly hold true for whatever other REST API you're looking at.
The data we're looking at will contain things like tasks, assigned people, tags, due dates, etc.
I say "why you shouldn't" because, as we'll learn, doing a direct connection like this is a series of problem-solving exercises based around overcoming the inherent limitations of PowerBI by creating PowerQuery functions ourselves.
The basics: getting data
Let's open up our query editor (Home -> Transform Data) and set up a few helper parameters (variables).
Here we can create variables just like in many other programming languages.
I'm going to create two parameters:
ClickUpTeamID
- This is the ID that represents your entire company/workspace. Everything lives in here (people, tasks, permissions, etc.). You can see it whenever you use Clickup: your URL will start with something like https://app.clickup.com/12651678/...
ClickUpAPIKey
- If you've used APIs before, you know what this is. It's both a username and password: a single value that authenticates whoever uses it. As such, the actual value is hidden pretty religiously by administrators, as simply seeing it is enough for anyone to copy-paste and start using it. You can get your key from https://app.clickup.com/10631688/settings/apps.
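One heads-up before we write any queries: the code throughout this article passes the key via a value called ClickUpAuthHeader. ClickUp's personal API keys go into the Authorization header as-is (no "Bearer " prefix), so ClickUpAuthHeader can simply be a third parameter, or a blank query like this sketch, holding the key:

```
// ClickUpAuthHeader: the value sent as the Authorization header in every request.
// ClickUp personal API keys are used raw, so this is just the key itself.
let
    ClickUpAuthHeader = ClickUpAPIKey
in
    ClickUpAuthHeader
```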
Now that we have our two parameters, let's build a quick query to grab the open tasks.
Looking at the documentation, I also included the include_closed and subtasks parameters to get those tasks as well.
let
Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="team/" & ClickUpTeamID & "/task",
Query=[include_closed="TRUE",
subtasks="TRUE"],
Headers=[Authorization=ClickUpAuthHeader]]))
in
Source
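As a side note on the shape of the response: Json.Document returns a record here, not a table, and the tasks themselves live in its tasks field as a list of records (which is why we'll later index into [Source][tasks]). A sketch of drilling down to that list:

```
let
    Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
        [RelativePath="team/" & ClickUpTeamID & "/task",
         Query=[include_closed="TRUE", subtasks="TRUE"],
         Headers=[Authorization=ClickUpAuthHeader]])),
    // The response record has a "tasks" field: a list of task records.
    Tasks = Source[tasks]
in
    Tasks
```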
If we run this, we get back 100 tasks. Which is great, but not enough. If we're building this to do analysis on our team's work habits/capacity, we need to see everything.
According to the API specification, tasks is a paginated GET. We need to include the page parameter to specify which page to fetch beyond the first.
So an example query that would get the next 100 results might be (note that pages are zero-indexed, and that Query values must be text, which is why the page number is quoted):
let
Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="team/" & ClickUpTeamID & "/task",
Query=[include_closed="TRUE",
subtasks="TRUE",
page="1"],
Headers=[Authorization=ClickUpAuthHeader]]))
in
Source
Issue #1: PowerBI Doesn't Support Pagination
To get all the results we're looking for, we can't just run a single query. We need to do one query per page, passing in a different page value each time, keep going until the result comes back empty (no more tasks), and then combine all the values.
PowerBI doesn't have a built-in way to do this. Ideally, the UI would give us an option to flag this as a paginated API and handle it accordingly, but no such luck. We'll have to code something ourselves.
Creating Functions
The let and in keywords define a query in PowerQuery. let defines the steps, and in sets the final object to be returned (99% of the time, in is just the name of your final step).
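If the let/in shape is new to you, a minimal example (nothing API-specific) looks like:

```
let
    x = 1,        // first step
    y = x + 1     // second step, built on the first
in
    y             // the query returns the value of y, i.e. 2
```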
If we add an input line like (PageNo as number) => to the top of the code, above let, we can turn our query into a function.
I'm also going to edit our code to use that variable as the page to fetch from the REST API.
(PageNo as number) =>
let
Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="team/" & ClickUpTeamID & "/task",
Query=[include_closed="TRUE",
subtasks="TRUE",
page=Number.ToText(PageNo)],
Headers=[Authorization=ClickUpAuthHeader]]))
in
Source
PowerBI now presents the query as a function, with a UI that accepts a parameter. When we use that UI to provide a value for PageNo and invoke it, a new query will be created that invokes the function, e.g.
let
Source = ClickupTasks(3)
in
Source
which will get the page of results at index 3 (pages are zero-indexed, so that's the fourth page).
With this in mind, I'm going to rename our function-ized query to ClickupGetTaskPages, and create a new query that looks like this.
let
Records = List.Generate(()=>
[Source = ClickupGetTaskPages(0), Page=0],
each List.Count([Source][tasks]) > 0,
each [Source = ClickupGetTaskPages([Page]+1), Page=[Page] +1],
each [Source])
in
Records
This code will generate a list, tracking an accompanying Page variable as it goes: while the Source query result isn't empty (i.e. it's still returning tasks), it increments the variable and calls the API again with the next page, before finally giving us all the Sources together as one big list.
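If you haven't met List.Generate before, it takes four functions: an initial state, a keep-going condition, a next-state, and a selector that picks what to emit. A tiny standalone example, unrelated to the API:

```
let
    // Produces {1, 2, 4, 8, 16, 32, 64}: start at 1, continue while under 100, double each time.
    Powers = List.Generate(
        () => 1,        // initial state
        each _ < 100,   // condition checked before emitting each state
        each _ * 2,     // how to get the next state
        each _)         // selector: emit the state itself
in
    Powers
```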
Each of these rows is the result of one query we made, containing up to 100 tasks.
From there, I run the following steps to convert this list into a table, and expand the Record objects down to the row level so that they're not summarized objects anymore. (I did this through the UI, but here's the code it generates.)
#"Converted to Table" = Table.FromList(Records, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"tasks"}, {"Column1.tasks"}),
#"Expanded Column1.tasks" = Table.ExpandListColumn(#"Expanded Column1", "Column1.tasks"),
#"Expanded Column1.tasks1" = Table.ExpandRecordColumn(#"Expanded Column1.tasks", "Column1.tasks", {"id", "custom_id", "name", "text_content", "description", "status", "orderindex", "date_created", "date_updated", "date_closed", "archived", "creator", "assignees", "watchers", "checklists", "tags", "parent", "priority", "due_date", "start_date", "points", "time_estimate", "custom_fields", "dependencies", "linked_tasks", "team_id", "url", "permission_level", "list", "project", "folder", "space"}, {"id", "custom_id", "name", "text_content", "description", "status", "orderindex", "date_created", "date_updated", "date_closed", "archived", "creator", "assignees", "watchers", "checklists", "tags", "parent", "priority", "due_date", "start_date", "points", "time_estimate", "custom_fields", "dependencies", "linked_tasks", "team_id", "url", "permission_level", "list", "project", "folder", "space"}),
And just like that, we've overcome our first hurdle. We got our tasks.
Issue #2: PowerBI Doesn't Support UNIX Timestamps
Checking the data returned from the API, I immediately noticed something odd. All of the time columns (start date, due date, …) have an integer value (like 1641782153786).
This is called a UNIX timestamp, and classically represents the number of seconds since January 1st, 1970 (when time started existing). ClickUp actually returns milliseconds, which is why the conversion below divides by 1000; 1641782153786 works out to 2022-01-10 02:35:53 UTC.
There are some great tools for converting this to a human-readable time.
PowerBI is not one of them.
If you try to use the built-in data type conversion tools to turn this into a timestamp, it doesnât work.
Here's my workaround.
let
getDateTimeZone = (optional UNIX_Timestamp) =>
let
Source = if Value.Is(UNIX_Timestamp, type text) then Number.FromText(UNIX_Timestamp , "en-US") else UNIX_Timestamp,
step2 = try DateTimeZone.ToLocal(DateTime.AddZone(#datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, Source/1000),0)) otherwise null
in
step2
in
getDateTimeZone
Adding the code above to a new blank query will create a function.
This function takes in an integer (or text, which it first converts to a number), and tries to convert it to a DateTimeZone value by adding the timestamp (as a duration of seconds, after dividing the milliseconds by 1000) to a 1970 datetime. If the value is null, it returns null instead so that we don't get errors.
The good news is that this works! The bad news is that this is more complex than just changing the field type. For each column we want to fix, we'll have to add a query step that adds a new column that invokes this function on the original column, and then delete the original column.
= Table.AddColumn(#"..Previous Step..", "start_date", each getDateTimeZone([original_start_date]))
= Table.RemoveColumns(#"Invoked Custom Function1",{"original_start_date"})
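As an aside, if keeping the original column name is fine, Table.TransformColumns can convert the values in place, which avoids the add-then-remove dance. A sketch, assuming a prior step named #"Previous Step" with a raw due_date column:

```
// Runs every value in due_date through getDateTimeZone, in place.
= Table.TransformColumns(#"Previous Step", {{"due_date", getDateTimeZone, type nullable datetimezone}})
```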
Creating context-specific queries
Once I started adding other queries to the API, I noticed a problem. For a few of the things I wanted to query (like tags, folders, and lists), there was no way to query the entire environment for them; I could only see them within the context of a larger bucket like a space! In Clickup terms, a space is a bucket inside the environment that can hold any number of lists/folders. If you ran a business, you might have a space each for accounting, sales, development, etc., with many folders and lists inside.
So, I can't just "get all the folders". First, I need to get all the spaces (a space just being a top-level bucket), and then run a query like this for each.
let
Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="space/" & ~space id~ & "/folder",
Query=[],
Headers=[Authorization=ClickUpAuthHeader]]))
in
Source
We can combine our previous two concepts to do exactly this.
First, I'll take the code above and convert it into a function.
(spaceid as text) as table =>
let
Source = Json.Document(Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="space/" & spaceid & "/folder",
Query=[
],
Headers=[Authorization=ClickUpAuthHeader]]))
in
Source
Then, I'll run a query to get my list of spaces, and add a step that invokes this function (just like I did with the timestamps) on every row. This returns, in each row, a table listing the folders in that row's space. I can expand the values (like we did for tasks) to get my folders out and visible at the row level.
let
Source = Json.Document(Web.Contents(ClickUpBaseURL,
[RelativePath="team/" & ClickUpTeamID & "/space",
Query=[archived="FALSE"
],
Headers=[Authorization=ClickUpAuthHeader]])),
spaces = Source[spaces],
#"Converted to Table" = Table.FromList(spaces, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "name", "private", "statuses", "multiple_assignees", "features", "archived"}, {"space_id", "name", "private", "statuses", "multiple_assignees", "features", "archived"}),
#"Added Custom" = Table.AddColumn(#"Expanded Column1", "Custom", each ClickupGetFolders([space_id])),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Added Custom", {"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Errors", "Custom", {"Folder.id", "Folder.name", "Folder.orderindex", "Folder.override_statuses", "Folder.hidden", "Folder.space", "Folder.task_count", "Folder.archived", "Folder.statuses", "Folder.lists", "Folder.permission_level"}, {"Folder.id", "Folder.name", "Folder.orderindex", "Folder.override_statuses", "Folder.hidden", "Folder.space", "Folder.task_count", "Folder.archived", "Folder.statuses", "Folder.lists", "Folder.permission_level"}),
in
#"Expanded Custom"
This dynamic query-path allows us to get everything we need.
however…
Issue #3: PowerBI.com Doesn't Support Refreshing Non-Static Query Paths
Remember how we had to use parameters and row-executed functions to overcome those pagination issues? And how we had to dynamically change the query path to look at the right space?
Well, the online service doesn't like that very much, and it won't let us refresh the data online once we publish this dashboard.
What's incredible to me is how this is such a random limitation.
You can connect to your data on desktop, manually refresh on desktop just fine, publish to the service and see the data online, but you can't schedule nor trigger a refresh on the service.
This means people can access the report from the website, but in order for that data to be up to date, a workspace admin would need to open the desktop file, hit Refresh, and re-publish the report. Even in a once-a-day cadence, this is an absurdly manual process.
After much googling, I was able to find someone else dealing with this problem, and a good blog post by Chris Webb: Web.Contents(), M Functions And Dataset Refresh Errors In Power BI
This query refreshes with no problems in Power BI Desktop. However, when you publish a report that uses this code to PowerBI.com and try to refresh the dataset, you'll see that refresh fails and returns a rather unhelpful error message:
Unable to refresh the model because it references an unsupported data source.
So here's the problem, as stated by Chris Webb:
When a published dataset is refreshed, Power BI does some static analysis on the code to determine what the data sources for the dataset are and whether the supplied credentials are correct. Unfortunately in some cases, such as when the definition of a data source depends on the parameters from a custom M function, that static analysis fails and therefore the dataset does not refresh.
Chris Webb is able to solve this for himself by setting a static query URL and adding parameters to the end as a data payload.
//before
Web.Contents(
"https://data.gov.uk/api/3/action/package_search?q="&Term
)
//after
Web.Contents(
"https://data.gov.uk/api/3/action/package_search",
[Query=[q=Term]]
)
But this is what we were already doing for things like "subtasks=TRUE". We weren't concatenating that into the URL, we were passing it as a parameter.
What we can't make static is the relative path itself. So, remember earlier, when we needed to iterate through and run a query for each space to get a list of folders inside each one?
Source = Json.Document(
Web.Contents("https://api.clickup.com/api/v2",
[RelativePath="space/" & spaceid & "/list",
...
That space id isn't a parameter; it's part of the actual path we're connecting to.
As far as I've found, there's no workaround for this.
If you're still determined at this point, here's what I suggest:
You could hardcode the list of spaces/folders to query as a set of static URL strings.
The report itself can now be refreshed on the service, but you will still have to occasionally go in and manually update those strings yourself. If people add new folders/spaces/etc., they won't automatically get queried by the report until you edit the file to do so.
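A sketch of that hardcoded approach, where "123" and "456" stand in for real space IDs you maintain by hand, and ClickupGetFolders is the function from earlier:

```
let
    // These IDs are maintained manually; new spaces won't appear until you add them here.
    SpaceIDs = {"123", "456"},
    // Run the folder query once per hardcoded space and collect the results.
    FolderResults = List.Transform(SpaceIDs, each ClickupGetFolders(_))
in
    FolderResults
```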
Alternatively, since you have all the tasks, you could get your lists of users/folders/lists/tags by creating summary tables that reference your Tasks table and grab the distinct values out of it. The downside of this one is that, as far as PowerBI is now concerned, anything that isn't being used by a task doesn't exist. You would not see any empty lists, inactive users, etc. You could imagine this might make your dashboard especially volatile if you were only pulling open tasks, as entire folders and people blink out of existence after closing their tasks.
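A sketch of that second approach, building a lists table out of the Tasks query (assuming the expanded list record column exposes id and name fields):

```
let
    // Derive a "lists" dimension from tasks instead of querying the API for it.
    Source = Table.SelectColumns(Tasks, {"list"}),
    Expanded = Table.ExpandRecordColumn(Source, "list", {"id", "name"}, {"list_id", "list_name"}),
    // Note: only lists that currently contain at least one task will show up here.
    DistinctLists = Table.Distinct(Expanded)
in
    DistinctLists
```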
Maybe use an ETL server instead.
Because of all these issues, I decided to use an online ETL server to first stage the data from the API.
Setting up a proper data pipeline offers a variety of benefits, not least of which is that any transformations you make can be accessible as a single "source of truth", instead of being siloed in the report. All of the workarounds we had to do to get the API working can be dropped in favor of pulling in standardized tables from our server.
I went with https://www.keboola.com/ as their free tier was generous enough that I was able to accomplish this without cost.
For posterity, here's the code I used to pull the Tasks, as an example of their configuration structure.
{
"parameters": {
"api": {
"baseUrl": "https://api.clickup.com/api/v2/team/[redacted]/",
"http": {
"headers": {
"Authorization": "pk_[redacted]",
"Content-Type": "application/json;charset=UTF-8"
},
"defaultOptions": {
"params": {
"include_closed": true,
"subtasks": true,
"assignees%5B%5D": "Sam Cochrane"
}
}
},
"pagination": {
"method": "pagenum",
"firstPage": 0
}
},
"config": {
"debug": true,
"outputBucket": "bk_clickup",
"jobs": [
{
"endpoint": "task",
"dataField": "tasks",
"responseFilter": [
"custom_fields"
]
}
]
}
}
}