I have some articles in the pipeline that shows how to achieve various tasks in MS Excel with VB.NET. It’s no rocket science but practical solutions for frequent tasks. The first article is this that covers how to remove duplicates from tables.
To following screen shot shows the table before removing any duplicates:
In the solution I will use columns 1,3,4 and 5 in the list to filter out the duplicates and than remove them.
The following code shows the VB solution. The method RemoveDuplicate expect that the column’s indexes are temporarily stored in an object array (Similar to worksheets’ collections).
Imports Excel = Microsoft.Office.Interop.Excel Imports System.Runtime.InteropServices Public Class Form1 Private Sub C1Button2_Click(sender As System.Object, e As System.EventArgs) _ Handles C1Button2.Click 'Excel's objects. Dim xlApp As New Excel.Application Dim xlWBook As Excel.Workbook = xlApp.Workbooks.Open( _ Filename:="C:\Users\Dennis Wallentin\Documents\RemDupes.xlsx") Dim xlSheet As Excel.Worksheet = CType(xlWBook.Worksheets(1), _ Excel.Worksheet) Dim xlRange As Excel.Range = xlSheet.Range("A1:E7") 'The Object Array that will hold the columns numbers when executing 'the instruction. Dim oColumnsToUse(0) As Object 'The default value for the parameter Header is XlNo. Dim Header As Microsoft.Office.Interop.Excel.XlYesNoGuess 'Populate the Object Array with column's numbers. oColumnsToUse.SetValue(1, 0) 'Col 1 oColumnsToUse.SetValue(3, 0) 'Col 3 oColumnsToUse.SetValue(4, 0) 'Col 4 oColumnsToUse.SetValue(5, 0) 'Col 5 'Removes all duplicates. xlRange.RemoveDuplicates(Columns:=oColumnsToUse, Header:=Header) 'Save and close the workbook. xlWBook.Close(SaveChanges:=vbYes) 'Cleaning up. If Not xlApp Is Nothing Then GC.Collect() GC.WaitForPendingFinalizers() GC.Collect() GC.WaitForPendingFinalizers() xlApp.Quit() End If Marshal.FinalReleaseComObject(xlRange) Marshal.FinalReleaseComObject(xlSheet) Marshal.FinalReleaseComObject(xlWBook) Marshal.FinalReleaseComObject(xlApp) End Sub
After executing the above code we will have a list that looks like the following:
Kind regards,
Dennis
So is the method “RemoveDuplicates” a VSTO extension or did you create your own extension?
Also, VSTO still doesn’t let you target multiple Excel versions, right?
Comment by Jon — March 19, 2012 @ 7:37 pm
Jon,
I find Your point about VSTO unrelated to this post. As You can see I refer to VB.NET. If You study the code You see it’s an automation of Excel.
For some unknown reason to me too many Excel developers have not take the time to investigate the .NET platform any further.
I’m pretty fed up with the trash talk about VSTO. .NET offer so much more then VSTO that too many Excel developers cannot see the forest due to some trees.
Kind regards,
Dennis
Comment by Dennis Wallentin — March 20, 2012 @ 12:43 am
Sorry, Dennis, didn’t mean to make you upset. I actually prefer to use .NET over VBA any day too. I was just wondering if the “RemoveDuplicates” was a feature of VSTO or new to the VBA language also in 2007 or 2010 and above. I think VSTO would be awesome if only you could use it on multiple platforms, so I was just wondering if it was updated yet or not.
Right now I am just using an Excel library derived from the EXE file so I can write for Excel 2003 and above. I use it with Excel DNA. But my preference would be to use only VSTO since I do have professional license for Visual Studio, it would be nice to actually use VSTO instead using a work around. I’m sure they’ve added a lot of cool features with VSTO also, that’s another reason I was asking, I was wondering what I am missing by not using VSTO.
No hard feelings I hope. Honest question with no malevolent intentions.
Comment by Jon — March 20, 2012 @ 2:15 am
Jon,
The RemoveDuplicate method is new from Excel 2007.
On the .NET platform (not VSTO) we can create managed COM add-ins and Automation add-ins. Not only that but also managed XLLs. We can compile our own Interop Assemblies (IA) and therefore we can create solutions that target Excel 2000 – 2010. Of course, new stuff introduced in later versions cannot be used. All of this can be done without VSTO on the .NET platform.
Let us see what a VSTO Excel add-in is:
Uses PIAs (Microsoft Public Interop Assemblies) which we also can use in the above non-VSTO solutions.
It uses a shimmed unmanaged DLL, vstoloader, to create the AppDomain in Excel’s memory where it also load the VSTO DLL to. It’s the same technically solution that we use in the above solutions. The only difference is that we need to shim the unmanaged DLL.
All Interop is well hidden, given an impression that it’s different from managed COM add-ins. But under the hood the same technically approach is used.
The above indicates that we don’t need VSTO at all for creating Add-ins.
The other option with VSTO is to create workbooks where we hook on a managed solution. If we detach it then the workbook is just an ordinary workbook. Of course, any VSTO technology stuff used will be temporarily removed. In my experience it’s only a few situations where it can help us.
From a practical point of view; we don’t need VSTO and instead of spending Your money for something that bring little or no values use them to buy the RAD Add-in Express.NET for Office.
Recently I read a thread where some high profiled MVPs discussed VSTO. I was surprised, very surprised to learn the lack of knowledge about .NET and VSTO.
The above ends all the discussion about VSTO in this thread.
Kind regards,
Dennis
Comment by Dennis Wallentin — March 20, 2012 @ 7:40 pm
Hi Dennis,
I tried using the code above, but for some reason it does not fully work. It deletes rows that are not duplicates as well.
My Excel sheet contains two columns where redundancies are to be searched for (please see below)
A B
188837 ARB
188837 BAH
188837 BAH
188837 LIT
188836 LAV
188836 LAV
188837 LAV
188837 HEB
I used the following code to remove duplicates, but it is removing the 2nd to last row(188837) as well.
Imports Excel = Microsoft.Office.Interop.Excel
Imports System.Runtime.InteropServices
Public Class Form1
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim xapp As Excel.Application = New Excel.Application
Dim xbook As Excel.Workbook
Dim xsheet As Excel.Worksheet
Dim xrange As Excel.Range
Dim lastrow As Long
Dim col(0) As Object
‘displays WB
xapp.Visible = True
‘Opens WB
xbook = xapp.Workbooks.Open(“path\test.xlsx”)
xsheet = xbook.Sheets(1)
col.SetValue(1, 0)
col.SetValue(2, 0)
With xsheet
lastrow = .Range(“A” & .Rows.Count).End(Excel.XlDirection.xlUp).Row
End With
xsheet.Activate()
xrange = xsheet.Range(“A1:B” & lastrow)
xrange.RemoveDuplicates(Columns:=col, Header:=Excel.XlYesNoGuess.xlGuess)
‘cleaning up
If Not xapp Is Nothing Then
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()
xapp.Quit()
End If
Marshal.FinalReleaseComObject(xrange)
Marshal.FinalReleaseComObject(xsheet)
Marshal.FinalReleaseComObject(xbook)
Marshal.FinalReleaseComObject(xapp)
End
End Sub
End Class
Your help is greatly appreciated.
Comment by Prashanth Karl — August 23, 2013 @ 6:04 pm
Hi,
OK, I will take a closer look on the issue. Please let me know which version of the Visual Studio You use.
Kind regards,
Dennis
Comment by Dennis M Wallentin — August 23, 2013 @ 7:19 pm
Hi Dennis!
Been a long time. Finally had the chance to come back to VB. Please ignore the above question since it was a silly error in logic. Your code works beautifully!
Thanks a mil.
Karl
Comment by Prashanth Karl — January 3, 2014 @ 1:38 pm
Hello,
Welcome back 🙂
As You can see I nowadays have more interest in other development tools then Microsoft’s.
But now I also will be back with VB. But only for the fun!
Take care,
Dennis
Comment by Dennis M Wallentin — January 4, 2014 @ 5:11 pm