Google Sheets has a built-in function called ImportXML which will import pieces of structured data from a webpage to a Google Sheet. You can use this function to import almost any publicly available information (unfortunately you can't use it with websites that you have to authenticate to).
Here's a question we've heard from customers in the past:
"I manage multiple Twitter accounts but they’re all for different parts of the same company. Instead of displaying a individual Twitter Follower Count widget for each of my different accounts, I want to show the total number of all our Twitter followers in one widget”.
The answer is to use the Google Sheets function, ImportXML.
In this article we’ll work through a similar scenario and explain how to use the ImportXML function of Google Sheets to import how many likes our Facebook page has and how many followers our Twitter page has. This function doesn't update on its own by default so we've written a script that we'll explain how to adapt to ensure your data is kept up to date.
- Syntax of the Function
- How does the Function Work?
- Updating your Data Automatically
Syntax of the Function
Here's the syntax of the ImportXML function:
URL- The URL of the page to examine, including protocol (e.g. http://). The value for URL must either be enclosed in quotation marks or be a reference to a cell containing the appropriate text.
query- is the XPath query to run on the data given at the URL. Each result from the XPath query is placed in its own row of the spreadsheet.
How does the Function Work?
Others have done a really great job of explaining how this function works so we'll leave you in their capable hands.
We recommend working through the first 4 chapters of this blog post as it will give you a really good basic knowledge of reading source code and interpreting HTML, XPath and the ImportXML function.
Note: We are aware that there are a few screenshots missing in Chapter 4 of this blog post, but this shouldn't affect the overall learning.
Once you've got your head around these things, come back to this article and we'll explain how you can use your newfound knowledge to the benefit of your Geckoboard dashboard.
Example 1: Importing Twitter Follower Data
In a web browser:
- Navigate to our Twitter feed http://twitter.com/geckoboard (we’re using Safari in our example)
- Right-click on the page and click Show Page Source.
- We’re looking for the Follower count in amongst all the code so click inside the code that appears and press Command + F (or Control + F) and type in "followers".
- The screenshot below shows the section of the page source that forms the image above e.g. Tweets, Following, Followers, Favorites.
- Our goal is to find something in the source code that uniquely identifies the number of followers we have. On closer inspection we can see that, inside the <a> tag, is an element called data-nav=”followers”. This element is our key to importing just the piece of data that states how many followers we have. To pull out this data we’ll use the XPath query //a[@data-nav='followers']. This finds all the <a> tags in the source code and imports any with a data-nav=”followers” attribute. As we’ve discovered, this is only used once in the source so will only import one result into our Google Sheet when we write our ImportXML function.
In Google Sheets:
- We’ve written Twitter in Cell A1 simply as a heading to explain what data will be found in the cells next to it
- In Cell B1, we’ll type our ImportXML function. This is made up of the URL of the page to examine, http://twitter.com/geckoboard, and the XPath query that we want to run on that page //a[@data-nav='followers']
- The word Followers is imported into cell B1 and the number of followers into cell C1. We haven’t done anything special to make this happen, this is just how the data has imported. However, it’s really fortunate as it leaves the number in its own cell which will make it easy to do calculations on.
Example 2: Importing Facebook Page Like Data
In a web browser:
- Navigate to our Facebook page:https://www.facebook.com/Geckoboard
- Right-click on the page and click Show Page Source
- This time we’re looking for the number of likes in amongst all the code so click inside the code that appears and press Command + F (or Control + F) and type in the number of likes we have e.g. ‘1,275. When we did this it brought back just 2 results (whereas searching for the word 'likes' brought back 61 results)
- Again, the goal is to find a unique identifier that will import just the number of likes we have on our Facebook page. Through some trial and error we found that, inside the <span> tag, was an element called id="PagesLikesCountDOMID" and this just pulled out the number of likes our Facebook has. The XPath query will look like //span[@id='PagesLikesCountDOMID'] which will find all the <span> tags in the source code and imports any with the id="PagesLikesCountDOMID" attribute.
In Google Sheets:
- We’ve written Facebook in Cell A2 as a heading to explain what data will be found in the cells next to it
- In Cell B2, we’ll type our ImportXML function. This is made up of the URL of the page to examine, https://www.facebook.com/Geckoboard and the XPath query that we want to run on that page, //span[@id='PagesLikesCountDOMID']
Updating your Data Automatically
Unfortunately, the Google Sheets ImportXML function won't automatically update from the source, even if the data on the source webpage is changed. This isn't particularly helpful so we've written a script that you can adapt to update your imported data every minute, even when the Google Sheet is closed.
Please note that these next steps do not follow on from the examples above and you can disregard these examples now.
Please open a new Google Sheet to work through these next steps.
Step 1: Prepare your Google Sheet
- In a new Google Sheet, the first thing we'll do is split the ImportXML function into two logical pieces; the URL and the XPath query. Write the URL in cell A1 and the XPath query in cell A2.
- If you want add another ImportXML function into this sheet so that you can do calculations with the data, add the second URL in cell A4 and the XPath query in cell A5.
- Give your Google Sheet a name.
Step 2: Write a Script
- Click on the Tools menu and select the option Script Editor...
- A Google Apps Script window will appear. Click the Blank Project option.
- Replace the default function with the script shown below:
- Double-check the following aspects of the first section of the script (
- Cell reference A1 should reflect where your first URL is written in your Google Sheet
- Cell reference A2 should reflect where your first XPath query is written in your Google Sheet
- Cell reference C1 should reflect where you want to display the result of your ImportXML function
- If you want to include a second ImportXML function, double-check the following in the second section of the script (
- Cell reference A4 should reflect where your first URL is written in your Google Sheet
- Cell reference A5 should reflect where your first XPath query is written in your Google Sheet
- Cell reference C2 should reflect where you want to display the result of your second ImportXML function
- If you don't want to include a second ImportXML function, delete everything from
...setValue(cellFunction2);leaving just the following script:
- Save your script and give it a name
Note: In case you're wondering why we're adding a random number to the end of our URL, it's because otherwise the website will be cached and the function stops importing the latest data.
Step 3: Set up a Trigger
- From inside the Script Editor, click on the Resources menu and select the option Current project's triggers
- In the window that appears, click the link No triggers set up. Click here to add one now.
- This will create a trigger for your function. Change the default Hour timer to Minutes timer.
- Click Save
- You'll be prompted to authorize your app by clicking Continue and then Accept. More information on Google Apps authorization can be found here.
- You can now close the Script Editor
Step 4: Add the Twitter and Facebook Totals Together
- The ImportXML function for our Facebook likes has put the number of likes and the word "likes" in one cell so we’ll need to split into separate cells before we can do a calculation on the number of likes
- We’ll use the Google Sheets Split function to split the text in cell C2 wherever it finds the space character
=SPLIT(C2, " ")
- Now that both the number of Twitter followers and Facebook likes are in their own cells, we’ll add a ‘Total’ heading in cell A3 and then use the Google Sheets SUM function in cell B3 to add these two results together and give us an overall total
- At this point, give your Google Sheet a name
Step 5: Add a Spreadsheets Number Widget in Geckoboard
- From your Geckoboard dashboard, click + Add widget
- Choose the Spreadsheets integration
- Select the Number widget
- Select the Google Sheets file that you've just created (ImportXML - Automatic Update in this example)
- Select the appropriate Sheet (Sheet1 in this example)
- Type C4 in the Number selection (which contains the total value of likes + followers in the above example)
- Configure the other widget options. Disable the Comparison Visualization and adjust the title and size
- Click Add to dashboard to add your widget to your dashboard
Step 6: The Result!
The Geckoboard dashboard is now displaying the information found in Cell C4 of the Google Sheet - our total Twitter and Facebook followers.
Now that you've written the script and set the trigger running, you'll be able to see the data on your Google Sheet refresh every minute. This will continue to refresh even when your Google Sheet is closed. Your Geckoboard Google Sheet Number widget will refresh every 10 minutes with the new data from your Google Sheet.
Note: We've found that, a few times a week, a server error will occur on Google's end and the script will fail to run. By default, you'll receive an email to notify you that this has happened. Considering the script runs every minute of the day which totals 1440 times, a few times a week is nothing to be concerned about.