pgsheets is a Python3 library for interacting with Google Sheets. It makes use of Pandas DataFrames, 2-dimensional structures perfectly suited for data analysis and representing a spreadsheet.
This library can be integrated easily with your existing data to present dashboards, update documents, or provide quick data analysis.
- Get / Set all or part of a Google Sheet
- Manage authorization with Google API
- Retrieve/set formulas or values
- Resize spreadsheets
- Add worksheets to and remove worksheets from a spreadsheet
- Open up a wealth of Pandas data tools to use on Google Sheets
Simply install with pip:
$ pip install pgsheets
If you haven't already you will need to create a project in Google's Developer Console and get your Client ID and Client Secret.
- Navigate to the Google Developer Console
- Create a project (you will be redirected to the project page)
- Click on APIs & Auth
- Click on Consent screen and set a Product Name
- Now click on credentials.
- Click Create new Client ID and select Installed Application > Other
Using your Google client id and client secret we can get a authorization URL to present to a user:
>>> from pgsheets import Token, Client
>>> c = Client(my_client_id, my_client_secret)
>>> c.getOauthUrl()
'https://accounts.google.com/o/oauth2/auth?...'
By visiting this URL a Google user can consent to your application viewing and modifying their Google sheets. After consenting to this an access code is returned, which we use to get a token:
>>> my_token = c.getRefreshToken(access_code)
>>> type(my_token)
str
>>> t = Token(c, my_token)
You need to save my_token for future use.
Create a spreadsheet (and make sure you save it) and copy the url. Now we can access the Spreadsheet:
>>> import pandas as pd
>>> from pgsheets import Spreadsheet
>>> s = Spreadsheet(t, my_url)
>>> s
<Spreadsheet title='test' key='.....'>
>>> s.getTitle()
'test'
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]
>>> w = s.getWorksheet('Sheet1')
>>> w.getTitle()
'Sheet1'
>>> w.setDataFrame(
pd.DataFrame([['money', 'interest', 'years', 'result'],
['1000', '0.015', '3', '=A2 * (1+B2) ^ C2']]),
copy_columns=False,
copy_index=False,
resize=True)
>>> w.asDataFrame()
interest years result
money
1000 0.015 3 =R[0]C[-3] * (1+R[0]C[-2]) ^ R[0]C[-1]
>>> w.asDataFrame(values=True)
interest years result
money
1000 0.015 3 1045.678375
>>> df = w.asDataFrame()
>>> df['checked'] = "TRUE"
>>> w.setDataFrame(df)
>>> w.asDataFrame()
interest years result checked
money
1000 0.015 3 =R[0]C[-3] * (1+R[0]C[-2]) ^ R[0]C[-1] TRUE
Add a worksheet with addWorksheet(), and remove a Worksheet object with removeWorksheet():
>>> import pandas as pd
>>> from pgsheets import Spreadsheet
>>> s = Spreadsheet(t, my_url)
>>> s
<Spreadsheet title='test' key='.....'>
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]
>>> w = s.addWorksheet('My Title')
<Worksheet title='My Title' sheet_key='.....'>
>>> w.getTitle()
'My Title'
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>, <Worksheet title='My Title' sheet_key='.....'>]
>>> s.removeWorksheet(w)
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]
The library has only been tested in Python3.4. It will almost certainly not work in Python2.
Currently the following cannot be done with pgsheets:
- Create a spreadsheet
- Rename a spreadsheet or a worksheet
- Prevent certain values from changing slightly e.g. 'True' becomes 'TRUE'
Finally the Google API has some limitations. Ideally this code should not cause any changes to a worksheet:
>>> w.setDataFrame(w.asDataFrame())
Unfortunately, there are certain edge cases. For example, with a Formula such as the following
={1, 2} |
which displays across two cells:
1 | 2 |
There is no clear way to know that the cell on the right wasn't input as a '2' by the user. Thus the above code would cause the following output:
={1, 2} | 2 |
which displays as:
#REF! | 2 |