Today I got the chance to dive into Power BI for real. I had just finished running a ShareGate source analysis for a customer (fileshare → SharePoint migration). The standard reports are always a nice start, but the customer wanted more insights:
👉 Why do we have so many duplicate files?
👉 What’s the story behind all those folders and file names?
That’s when I decided to take the exported Excel and load it into Power BI. From there I started playing with DAX – and wow, it’s actually easier than I expected to get meaningful insights out of messy data.
One fun experiment: spotting “versioned” files by their names. You know the usual suspects — contract_v1.docx, contract_final.pdf, offerte_def.pptx…
Here’s how I approached it step by step:
1. Remove the file extension
A helper column to get a clean base name:
NameWithoutExt =
VAR Title = 'Data'[Title]
VAR Ext = 'Data'[File extension]
RETURN
IF ( NOT ISBLANK(Ext) && RIGHT(Title, LEN(Ext)) = Ext,
LEFT(Title, LEN(Title) - LEN(Ext)),
Title
)
2. Flag “version by name”
This checks for keywords (final, definitief, def, laatste, rev, versie) and simple version tags (_v1, -v2, v3 …).
VersionByNameFlag =
VAR N = LOWER ( 'Data'[NameWithoutExt] )
VAR HasKeyword =
CONTAINSSTRING(N," final") || CONTAINSSTRING(N," definitief") ||
CONTAINSSTRING(N," def") || CONTAINSSTRING(N," laatste") ||
CONTAINSSTRING(N," rev") || CONTAINSSTRING(N," versie")
VAR HasVNumber =
CONTAINSSTRING(N," v1") || CONTAINSSTRING(N," v2") || CONTAINSSTRING(N," v3") ||
CONTAINSSTRING(N," v4") || CONTAINSSTRING(N," v5") || CONTAINSSTRING(N," v6") ||
CONTAINSSTRING(N," v7") || CONTAINSSTRING(N," v8") || CONTAINSSTRING(N," v9")
RETURN IF ( 'Data'[Type] = "File" && ( HasKeyword || HasVNumber ), TRUE(), FALSE() )
3. Group and count
Now group files with the same base name that have multiple “versions”:
VersionByName Groups =
FILTER (
ADDCOLUMNS (
SUMMARIZE (
FILTER ( Data, Data[Type] = "File" && Data[VersionByNameFlag] = TRUE ),
Data[NameWithoutExt]
),
"File Count", CALCULATE ( COUNTROWS ( Data ) )
),
[File Count] > 1
)
How I showed it in Power BI
- Table 1 (Groups): shows the base name + file count.
- Table 2 (Instances): shows the actual file names and paths, filtered on VersionByNameFlag = TRUE.
- Turn on cross-filtering: click on a group → instantly see all its “versions” below.
🎯 The result? Within minutes we had a clear picture of how many “manual versions” were floating around in the file share. This makes it much easier to explain to the customer why moving to SharePoint with proper version history isn’t just nice-to-have — it’s a big win.
This was just my first Power BI experiment, but I can already see the potential. A simple Excel export turned into an interactive dashboard that tells a story. And the best part: the DAX code is not scary at all.