
VBA in Excel to automate IE for crawling a web page


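If you want to open a website and read the contents of a web page using VBA,
start by adding a reference to the Microsoft HTML Object Library in the VBA
editor (Tools > References). The snippet in this post uses late binding
(CreateObject and untyped variables), so it also runs without the reference;
adding it mainly gives you IntelliSense and typed HTML objects.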


The following snippet is a good starting point:
Sub GoToWebSiteAndPlayAround()

Dim appIE As Object ' InternetExplorer.Application
Dim sURL As String

Application.ScreenUpdating = False
Set appIE = CreateObject("InternetExplorer.Application")

' Example URL: a Google search for "vba"; replace it with your target web page
sURL = "https://www.google.com/search?q=vba"

With appIE
    .navigate sURL
    ' uncomment the line below if you want to watch the code execute, or for debugging
    '.Visible = True
End With

' loop until the page finishes loading (readyState 4 = READYSTATE_COMPLETE)
Do While appIE.Busy Or appIE.readyState <> 4
    DoEvents
Loop

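' Get info from the HTML by element ID, tag name, and class name
Dim HTMLDoc As Object, outerDiv As Object
Dim spanCollection As Object, outerSpan As Object
Dim requiredText As String

Set HTMLDoc = appIE.document
' placeholder: pass the id attribute of the element that contains your data
Set outerDiv = HTMLDoc.getElementById("outerDivClassName")

' getElementById returns Nothing when no element has that id
If Not outerDiv Is Nothing Then
    Set spanCollection = outerDiv.getElementsByTagName("SPAN")

    For Each outerSpan In spanCollection
        If outerSpan.className = "outerSpanClassName" Then
            requiredText = outerSpan.innerHTML
            Debug.Print requiredText
        End If
    Next outerSpan
End If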

Application.ScreenUpdating = True
appIE.Quit
End Sub
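
The wait loop in the snippet above never exits if the page fails to finish loading. Below is a minimal sketch of a bounded wait, assuming a hypothetical helper named WaitForPage, a hypothetical PrintFinalUrl caller, and a placeholder URL (none of these are from the original post). It also prints the browser's LocationURL property, which should hold the address of the page actually displayed once any redirects have completed.

```vba
' Hypothetical helper (not part of the original snippet):
' waits until IE reports the page as loaded, or until the timeout expires.
' Returns True if the page finished loading in time.
Function WaitForPage(ByVal appIE As Object, _
                     Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Const READYSTATE_COMPLETE As Long = 4
    Dim startTime As Double

    startTime = Timer

    Do While appIE.Busy Or appIE.readyState <> READYSTATE_COMPLETE
        DoEvents
        ' Timer resets at midnight, so this simple check may not fire
        ' if the wait happens to span midnight
        If Timer - startTime > timeoutSeconds Then
            WaitForPage = False
            Exit Function
        End If
    Loop

    WaitForPage = True
End Function

Sub PrintFinalUrl()
    Dim appIE As Object

    Set appIE = CreateObject("InternetExplorer.Application")
    appIE.navigate "https://www.example.com" ' placeholder URL

    If WaitForPage(appIE, 20) Then
        ' LocationURL is the address of the page currently displayed
        Debug.Print appIE.LocationURL
    Else
        Debug.Print "Timed out waiting for the page"
    End If

    appIE.Quit
End Sub
```

Returning a Boolean rather than raising an error leaves it to the caller to decide whether a timeout should stop the crawl or just skip the page.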



Comments

  1. Hi Hitesh... what if you want to determine the final URL for the page?

  2. Hi Jon,

    In case the URL is redirected, you can wait for the required URL or web page title to appear. Please have a look at this link for more help: http://vba-corner.livejournal.com/4623.html. Let me know if this resolves your requirement.

