Retrieve files bigger than 50Mb in a SharePoint Online site using PowerShell PnP

Recently, I had a request to find every files in a specific SharePoint Online site that are 50Mb or bigger. Which I thought “this should be straight forward with conditions in my script“.
Well, turned out it wasn’t that straight forward after all. And I needed to be a bit “creative” 🙂

When using the SharePoint PowerShell PnP module, a few outputs are either not the ones you expect, or the type is different.

So I wanted to share my experience, findings, and solution!

 

Connect to SharePoint Online

First thing as usual, let’s connect to SPO using the Connect-PnPOnline cmdlet.
If you are using the Credential Manager, then you don’t need the Get-Credential cmdlet, nor the -Credential parameter. I always am, but if someone else is using the script, you never know.

 

$creds = Get-Credential
Connect-PnPOnline -Url 'https://<YOUR_TENANT_NAME>.sharepoint.com/sites/<YOUR_SITE>' -Credentials $creds

 

 

Loop through Libraries

The next step would be to gather the logical information needed.
So we want to go through each library in our site (current context), and as we’re interested in the file size, let’s focus on this column (field).

#Store in variable all the document libraries in the site
$DocLibrary = Get-PnPList | Where-Object {$_.BaseTemplate -eq 101}

foreach ($DocLib in $DocLibrary) {
    
    #Get list of all folders in the document library
    $AllItems = Get-PnPListItem -List $DocLib -Fields "SMTotalFileStreamSize"
       
}

To get the “SMTotalFileStreamSize” column, we need to look at the properties available for the Get-PnPListItem cmdlet.

So for this, let me give you an example:
I know that I have a library in this site, that’s called “docLib1“. Therefore, I’m going to run  Get-PnPListItem -List "docLib1" | Get-Member
At this stage, you will see a property called FieldValues. If we run (Get-PnPListItem -List "docLib1").FieldValues then we’ll be able to see ALL the columns values (fields) for the items.

ℹ️ For more information, refer to the Microsoft documentation for Get-PnPListItem (example 4)

 

Let’s go back to the script.
We know we don’t want to return all the files right? We only want those >50Mb (or even equal for that matter). So we need to insert a condition.
We also want to have some basic information about those files, like name, path, etc… + a little bit of artifacts with Write-Host

Not to forget to create a custom object for the desired information.

#Store in variable all the document libraries in the site
$DocLibrary = Get-PnPList | Where-Object { $_.BaseTemplate -eq 101 } 

foreach ($DocLib in $DocLibrary) {
    #Get list of all folders in the document libraries
    $AllItems = Get-PnPListItem -List $DocLib -Fields "SMTotalFileStreamSize"
    
    #Loop through each files/folders in the document libraries for files >50Mb
    foreach ($Item in $AllItems) {
        if ((($Item["SMTotalFileStreamSize"]) -ge 50000000)) {
            Write-Host "File found:" $Item["FileLeafRef"] -ForegroundColor Yellow
        
            [PSCustomObject]@{
                FileName         = $Item["FileLeafRef"] 
                FilePath         = $Item["FileRef"]
                SizeInMB         = ($Item["SMTotalFileStreamSize"] / 1MB).ToString("N")
                LastModifiedBy   = $Item.FieldValues.Editor.LookupValue
                EditorEmail      = $Item.FieldValues.Editor.Email
                LastModifiedDate = [DateTime]$Item["Modified"]
            }
        }#end of IF statement
    }
}

Sounds about right? Well, not really.
Because when we run this script, we face 2 issues 🤔

  1. The output is not filtering the data >50Mb like we requested, but returns all the files
  2. We also get the folders in the lot!

 

 

Let’s be creative!

For our first problem about the script not filtering, I had to dig a bit deeper into why my number was not working.
By checking what the variable $Item[“SMTotalFileStreamSize”] returned, I got the number 432 (for 432Kb). Looking deeper with $Item[“SMTotalFileStreamSize”].GetType().FullName, it was a string!

So I decided to cast my $Item into an integer with [int], and voilà! Filtering only the files and folders >50Mb.

 

Which brings us onto our second problem… Folders.
Tried to use the File property, but no. Still, the folders were showing up. Because I couldn’t find anything, I needed to be creative!
I noticed on the output I was after, that I needed a name with an extension (i.e.: .mp4, .mov, etc…)

So I added a second condition on my IF statement, with -like “*.*” and that did the trick!

Problem(s) solved, and full script below 🤓

#Store in variable all the document libraries in the site
$DocLibrary = Get-PnPList | Where-Object { $_.BaseTemplate -eq 101 } 

foreach ($DocLib in $DocLibrary) {
    #Get list of all folders in the document library
    $AllItems = Get-PnPListItem -List $DocLib -Fields "SMTotalFileStreamSize"
    
    #Loop through each files/folders in the document library for files >50Mb
    foreach ($Item in $AllItems) {
        if ((([int]$Item["SMTotalFileStreamSize"]) -ge 50000000) -and ($Item["FileLeafRef"] -like "*.*")) {
            Write-Host "File found:" $Item["FileLeafRef"] -ForegroundColor Yellow
        
            [PSCustomObject]@{
                FileName         = $Item["FileLeafRef"] 
                FilePath         = $Item["FileRef"]
                SizeInMB         = ($Item["SMTotalFileStreamSize"] / 1MB).ToString("N")
                LastModifiedBy   = $Item.FieldValues.Editor.LookupValue
                EditorEmail      = $Item.FieldValues.Editor.Email
                LastModifiedDate = [DateTime]$Item["Modified"]
            }
        }#end of IF statement
    }
}

 

Large files output

 

Hope it helps someone out there, and if you have a better way, I’m all ears!

 

 

 

Leave a Reply

%d bloggers like this: