Thursday, September 8, 2011

VBA in Excel to automate IE for crawling a web page

If you want to open a website and go through the results of a webpage using VBA 
you can achieve it by first including a reference to Microsoft HTML Object Library 
in your VBA editor.

The following snippet of code should be a good starting point of how you can achieve the same
Sub GoToWebSiteAndPlayAround()

Dim appIE As Object ' InternetExplorer.Application
Dim sURL As String

Application.ScreenUpdating = False
Set appIE = CreateObject("InternetExplorer.Application")

'URL with the search term 'Cancer' at Science
sURL = "" 'this URL to be replaced by your target web page

With appIE
    .navigate sURL
    ' uncomment the line below if you want to watch the code execute, or for debugging
    '.Visible = True
End With

' loop until the page finishes loading
Do While appIE.readyState <> 4

'Get info from HTML by ID and Name
Dim outerDiv, innerSpan, requiredtext, HTMLDoc
Dim spanCollection, outerSpan
   Set HTMLDoc = appIE.document
   Set outerDiv = HTMLDoc.getElementById("outerDivClassName")
        Set spanCollection = outerDiv.getElementsByTagName("SPAN")
        For Each outerSpan In spanCollection
            If outerSpan.className = "outerSpanClassName" Then
                requiredtext = outerSpan.innerHTML
                Debug.Print requiredtext
            End If

Application.ScreenUpdating = True
End Sub


  1. Hi Hitesh... what if you want to determine the final URL for the page?

  2. Hi Jon,

    In case the url is redirected you can wait for the required url /web page title to appear. Please have a look at this link for more help Let me know if this resolves your requirement