Not all web-scraping has to be challenging (such as a massive set of python scripts). Some of it can be done just by going to a URL with a GET request. (Some of the best APIs respond to the GET HTTP verb)
The challenge is comparing the data on all the pages at once. It would be easier if all the information was in something like Excel - and it can be! Most pages use HTML, which is easily understood using the Document Object Model (DOM). For these cases that don't need security, it is very easy to load data using excel macros.
The first step is to save your file as an XLSM (macro enabled). Then open the visual basic editor (alt+f11) and create a new module. Inside the module, you can use the following code:
Option Explicit ' forces you to declare all the variables
Function GetValueFromURL(url As String, id As String) As Variant
' to refer to the running copy of Internet Explorer
Dim ie As InternetExplorer
' to refer to the HTML document returned
Dim html As HTMLDocument
' open Internet Explorer in memory, and go to website
Set ie = New InternetExplorer
' turn off visibility so the IE window does not open
ie.Visible = False
ie.navigate url
' Wait until IE is done loading page
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to " & url & " and get the value of " & id & " ..."
DoEvents
Loop
' get the HTML document returned
Set html = ie.document
' close down IE
Set ie = Nothing
' and reset status bar
Application.StatusBar = ""
' In VBA, set the result equal to the method name to return the value
GetValueFromURL = html.getElementById(id).innerText
End Function
Nice! Now I can do something like get the date of every blog post, and make a histogram to see when I was busiest. Well, that's just one example.
So, the above example will work with HTML, but what about loading things like JSON that are NOT HTML documents. We don't want to parse anything, just get a string
Function GetWebText(url As String) As String
Dim objHttp As Object
Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
' use the HTTP GET verb (POST is recommended to send data)
' when calling the url
' asynchronously = False
objHttp.Open "GET", url, False
' since this is GET, we don't send data to the server
objHttp.send ""
GetWebText = objHttp.responseText
End Function
This method is a little simplified, but you don't have access to the DOM. However, changing the GET verb to POST, one can send data to a server using this macro.