VSTO & .NET & Excel

March 19, 2012

Remove Duplicates in VB

Filed under: .NET & Excel, COM Add-ins, Excel, VSTO & Excel — Dennis M Wallentin @ 3:42 pm

I have some articles in the pipeline that shows how to achieve various tasks in MS Excel with VB.NET. It’s no rocket science but practical solutions for frequent tasks. The first article is this that covers how to remove duplicates from tables.

To following screen shot shows the table before removing any duplicates:

In the solution I will use columns 1,3,4 and 5 in the list to filter out the duplicates and than remove them.

The following code shows the VB solution. The method RemoveDuplicate expect that the column’s indexes are temporarily stored in an object array (Similar to worksheets’ collections).

Imports Excel = Microsoft.Office.Interop.Excel
Imports System.Runtime.InteropServices

Public Class Form1

Private Sub C1Button2_Click(sender As System.Object, e As System.EventArgs) _
Handles C1Button2.Click

'Excel's objects.
Dim xlApp As New Excel.Application
Dim xlWBook As Excel.Workbook = xlApp.Workbooks.Open( _
Filename:="C:\Users\Dennis Wallentin\Documents\RemDupes.xlsx")

Dim xlSheet As Excel.Worksheet = CType(xlWBook.Worksheets(1),  _
Excel.Worksheet)
Dim xlRange As Excel.Range = xlSheet.Range("A1:E7")

'The Object Array that will hold the columns numbers when executing
'the instruction.
Dim oColumnsToUse(0) As Object

'The default value for the parameter Header is XlNo.
Dim Header As Microsoft.Office.Interop.Excel.XlYesNoGuess

'Populate the Object Array with column's numbers.
oColumnsToUse.SetValue(1, 0) 'Col 1
oColumnsToUse.SetValue(3, 0) 'Col 3
oColumnsToUse.SetValue(4, 0) 'Col 4
oColumnsToUse.SetValue(5, 0) 'Col 5

'Removes all duplicates.
xlRange.RemoveDuplicates(Columns:=oColumnsToUse, Header:=Header)

'Save and close the workbook.
xlWBook.Close(SaveChanges:=vbYes)

'Cleaning up.
If Not xlApp Is Nothing Then
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()
xlApp.Quit()
End If

Marshal.FinalReleaseComObject(xlRange)
Marshal.FinalReleaseComObject(xlSheet)
Marshal.FinalReleaseComObject(xlWBook)
Marshal.FinalReleaseComObject(xlApp)

End Sub

After executing the above code we will have a list that looks like the following:

Kind regards,
Dennis

Advertisement

8 Comments »

  1. So is the method “RemoveDuplicates” a VSTO extension or did you create your own extension?

    Also, VSTO still doesn’t let you target multiple Excel versions, right?

    Comment by Jon — March 19, 2012 @ 7:37 pm

    • Jon,
      I find Your point about VSTO unrelated to this post. As You can see I refer to VB.NET. If You study the code You see it’s an automation of Excel.

      For some unknown reason to me too many Excel developers have not take the time to investigate the .NET platform any further.

      I’m pretty fed up with the trash talk about VSTO. .NET offer so much more then VSTO that too many Excel developers cannot see the forest due to some trees.

      Kind regards,
      Dennis

      Comment by Dennis Wallentin — March 20, 2012 @ 12:43 am

      • Sorry, Dennis, didn’t mean to make you upset. I actually prefer to use .NET over VBA any day too. I was just wondering if the “RemoveDuplicates” was a feature of VSTO or new to the VBA language also in 2007 or 2010 and above. I think VSTO would be awesome if only you could use it on multiple platforms, so I was just wondering if it was updated yet or not.

        Right now I am just using an Excel library derived from the EXE file so I can write for Excel 2003 and above. I use it with Excel DNA. But my preference would be to use only VSTO since I do have professional license for Visual Studio, it would be nice to actually use VSTO instead using a work around. I’m sure they’ve added a lot of cool features with VSTO also, that’s another reason I was asking, I was wondering what I am missing by not using VSTO.

        No hard feelings I hope. Honest question with no malevolent intentions.

        Comment by Jon — March 20, 2012 @ 2:15 am

      • Jon,
        The RemoveDuplicate method is new from Excel 2007.

        On the .NET platform (not VSTO) we can create managed COM add-ins and Automation add-ins. Not only that but also managed XLLs. We can compile our own Interop Assemblies (IA) and therefore we can create solutions that target Excel 2000 – 2010. Of course, new stuff introduced in later versions cannot be used. All of this can be done without VSTO on the .NET platform.

        Let us see what a VSTO Excel add-in is:
        Uses PIAs (Microsoft Public Interop Assemblies) which we also can use in the above non-VSTO solutions.
        It uses a shimmed unmanaged DLL, vstoloader, to create the AppDomain in Excel’s memory where it also load the VSTO DLL to. It’s the same technically solution that we use in the above solutions. The only difference is that we need to shim the unmanaged DLL.
        All Interop is well hidden, given an impression that it’s different from managed COM add-ins. But under the hood the same technically approach is used.

        The above indicates that we don’t need VSTO at all for creating Add-ins.

        The other option with VSTO is to create workbooks where we hook on a managed solution. If we detach it then the workbook is just an ordinary workbook. Of course, any VSTO technology stuff used will be temporarily removed. In my experience it’s only a few situations where it can help us.

        From a practical point of view; we don’t need VSTO and instead of spending Your money for something that bring little or no values use them to buy the RAD Add-in Express.NET for Office.

        Recently I read a thread where some high profiled MVPs discussed VSTO. I was surprised, very surprised to learn the lack of knowledge about .NET and VSTO.

        The above ends all the discussion about VSTO in this thread.

        Kind regards,
        Dennis

        Comment by Dennis Wallentin — March 20, 2012 @ 7:40 pm

  2. Hi Dennis,

    I tried using the code above, but for some reason it does not fully work. It deletes rows that are not duplicates as well.

    My Excel sheet contains two columns where redundancies are to be searched for (please see below)

    A B
    188837 ARB
    188837 BAH
    188837 BAH
    188837 LIT
    188836 LAV
    188836 LAV
    188837 LAV
    188837 HEB

    I used the following code to remove duplicates, but it is removing the 2nd to last row(188837) as well.

    Imports Excel = Microsoft.Office.Interop.Excel
    Imports System.Runtime.InteropServices

    Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim xapp As Excel.Application = New Excel.Application
    Dim xbook As Excel.Workbook
    Dim xsheet As Excel.Worksheet
    Dim xrange As Excel.Range
    Dim lastrow As Long
    Dim col(0) As Object

    ‘displays WB
    xapp.Visible = True

    ‘Opens WB
    xbook = xapp.Workbooks.Open(“path\test.xlsx”)

    xsheet = xbook.Sheets(1)

    col.SetValue(1, 0)
    col.SetValue(2, 0)

    With xsheet
    lastrow = .Range(“A” & .Rows.Count).End(Excel.XlDirection.xlUp).Row
    End With

    xsheet.Activate()
    xrange = xsheet.Range(“A1:B” & lastrow)
    xrange.RemoveDuplicates(Columns:=col, Header:=Excel.XlYesNoGuess.xlGuess)

    ‘cleaning up
    If Not xapp Is Nothing Then
    GC.Collect()
    GC.WaitForPendingFinalizers()
    GC.Collect()
    GC.WaitForPendingFinalizers()
    xapp.Quit()
    End If

    Marshal.FinalReleaseComObject(xrange)
    Marshal.FinalReleaseComObject(xsheet)
    Marshal.FinalReleaseComObject(xbook)
    Marshal.FinalReleaseComObject(xapp)

    End
    End Sub
    End Class

    Your help is greatly appreciated.

    Comment by Prashanth Karl — August 23, 2013 @ 6:04 pm

    • Hi,
      OK, I will take a closer look on the issue. Please let me know which version of the Visual Studio You use.

      Kind regards,
      Dennis

      Comment by Dennis M Wallentin — August 23, 2013 @ 7:19 pm

  3. Hi Dennis!
    Been a long time. Finally had the chance to come back to VB. Please ignore the above question since it was a silly error in logic. Your code works beautifully!
    Thanks a mil.
    Karl

    Comment by Prashanth Karl — January 3, 2014 @ 1:38 pm

    • Hello,
      Welcome back 🙂
      As You can see I nowadays have more interest in other development tools then Microsoft’s.
      But now I also will be back with VB. But only for the fun!

      Take care,
      Dennis

      Comment by Dennis M Wallentin — January 4, 2014 @ 5:11 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: