Google Sheets has a built-in function called ImportXML which will import pieces of structured data from a webpage to a Google Sheet. You can use this function to import publicly available information (unfortunately you can't use it with websites that you have to authenticate to or prevent scrapping).
We'll show you how this function can be used to aggregate Twitter followers from different Twitter accounts and then display that information in Geckoboard through our Spreadsheets integration.
Syntax of the ImportXML function
The syntax of the ImportXML function is:
- The URL of the page to examine, including protocol (e.g. http://). The value for URL must either be enclosed in quotation marks or be a reference to a cell containing the appropriate text.
- The XPath query to run on the data given at the URL. Each result from the XPath query is placed in its own row of the spreadsheet.
How does the ImportXML function work?
We recommend working through the first 4 chapters of this blog post as it will give you a really good basic knowledge of reading source code and interpreting HTML, XPath and the ImportXML function.
Once you've got your head around these things, come back to this article and we'll explain how you can use your newfound knowledge to the benefit of your Geckoboard dashboard.
Step1: Importing Twitter Follower Data
In a web browser:
- Navigate to our Twitter feed http://twitter.com/geckoboard (we’re using Safari in our example).
- Right-click on the page and click Show Page Source.
- We’re looking for the Follower count in amongst all the code so click inside the code that appears and press Command + F (or Control + F) and type in "followers".
- The screenshot below shows the section of the page source that forms the image above e.g. Tweets, Following, Followers, Favorites.
- Our goal is to find something in the source code that uniquely identifies the number of followers we have. On closer inspection we can see that, inside the <a> tag, is an element called data-nav=”followers”. This element is our key to importing just the piece of data that states how many followers we have.
To pull out this data we’ll use the XPath query //a[@data-nav='followers']. This finds all the <a> tags in the source code and imports any with a data-nav=”followers” attribute. As we’ve discovered, this is only used once in the source so will only import one result into our Google Sheet when we write our ImportXML function.
In Google Sheets:
- We’ve written Twitter in Cell A1 simply as a heading to explain what data will be found in the cells next to it.
- In Cell B1, we’ll type our ImportXML function. This is made up of the URL of the page to examine, http://twitter.com/geckoboard, and the XPath query that we want to run on that page //a[@data-nav='followers']
- The word Followers is imported into cell B1 and the number of followers into cell C1. We haven’t done anything special to make this happen, this is just how the data has imported. However, it’s really fortunate as it leaves the number in its own cell which will make it easy to do calculations on.
Step 2: Use Spreadsheets to visualize the data in Geckoboard
With the number of Twitter followers data in your Google Sheet, create a new Spreadsheets widget in Geckoboard to display this information on the dashboard.