Skip to content

Latest commit

 

History

History
182 lines (139 loc) · 5.36 KB

README.rst

File metadata and controls

182 lines (139 loc) · 5.36 KB

pgsheets : Manipulate Google Sheets Using Python

pgsheets is a Python3 library for interacting with Google Sheets. It makes use of Pandas DataFrames, 2-dimensional structures perfectly suited for data analysis and representing a spreadsheet.

This library can be integrated easily with your existing data to present dashboards, update documents, or provide quick data analysis.

Features

  • Get / Set all or part of a Google Sheet
  • Manage authorization with Google API
  • Retrieve/set formulas or values
  • Resize spreadsheets
  • Add worksheets to and remove worksheets from a spreadsheet
  • Open up a wealth of Pandas data tools to use on Google Sheets

Installation

Simply install with pip:

$ pip install pgsheets

Usage

Setting up a Google Project

If you haven't already you will need to create a project in Google's Developer Console and get your Client ID and Client Secret.

  1. Navigate to the Google Developer Console
  2. Create a project (you will be redirected to the project page)
  3. Click on APIs & Auth
  4. Click on Consent screen and set a Product Name
  5. Now click on credentials.
  6. Click Create new Client ID and select Installed Application > Other

Getting a Token

Using your Google client id and client secret we can get a authorization URL to present to a user:

>>> from pgsheets import Token, Client
>>> c = Client(my_client_id, my_client_secret)
>>> c.getOauthUrl()
'https://accounts.google.com/o/oauth2/auth?...'

By visiting this URL a Google user can consent to your application viewing and modifying their Google sheets. After consenting to this an access code is returned, which we use to get a token:

>>> my_token = c.getRefreshToken(access_code)
>>> type(my_token)
str
>>> t = Token(c, my_token)

You need to save my_token for future use.

Editing a spreadsheet

Create a spreadsheet (and make sure you save it) and copy the url. Now we can access the Spreadsheet:

>>> import pandas as pd
>>> from pgsheets import Spreadsheet
>>> s = Spreadsheet(t, my_url)
>>> s
<Spreadsheet title='test' key='.....'>
>>> s.getTitle()
'test'
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]
>>> w = s.getWorksheet('Sheet1')
>>> w.getTitle()
'Sheet1'
>>> w.setDataFrame(
        pd.DataFrame([['money', 'interest', 'years', 'result'],
                      ['1000', '0.015', '3', '=A2 * (1+B2) ^ C2']]),
        copy_columns=False,
        copy_index=False,
        resize=True)
>>> w.asDataFrame()
          interest years                                  result
    money
    1000     0.015     3  =R[0]C[-3] * (1+R[0]C[-2]) ^ R[0]C[-1]
>>> w.asDataFrame(values=True)
          interest years       result
    money
    1000     0.015     3  1045.678375
>>> df = w.asDataFrame()
>>> df['checked'] = "TRUE"
>>> w.setDataFrame(df)
>>> w.asDataFrame()
          interest years                                  result checked
    money
    1000     0.015     3  =R[0]C[-3] * (1+R[0]C[-2]) ^ R[0]C[-1]    TRUE

Adding or Removing Worksheets

Add a worksheet with addWorksheet(), and remove a Worksheet object with removeWorksheet():

>>> import pandas as pd
>>> from pgsheets import Spreadsheet
>>> s = Spreadsheet(t, my_url)
>>> s
<Spreadsheet title='test' key='.....'>
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]
>>> w = s.addWorksheet('My Title')
<Worksheet title='My Title' sheet_key='.....'>
>>> w.getTitle()
'My Title'
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>, <Worksheet title='My Title' sheet_key='.....'>]
>>> s.removeWorksheet(w)
>>> s.getWorksheets()
[<Worksheet title='Sheet1' sheet_key='.....'>]

Limitations

The library has only been tested in Python3.4. It will almost certainly not work in Python2.

Currently the following cannot be done with pgsheets:

  • Create a spreadsheet
  • Rename a spreadsheet or a worksheet
  • Prevent certain values from changing slightly e.g. 'True' becomes 'TRUE'

Finally the Google API has some limitations. Ideally this code should not cause any changes to a worksheet:

>>> w.setDataFrame(w.asDataFrame())

Unfortunately, there are certain edge cases. For example, with a Formula such as the following

={1, 2}  

which displays across two cells:

1 2

There is no clear way to know that the cell on the right wasn't input as a '2' by the user. Thus the above code would cause the following output:

={1, 2} 2

which displays as:

#REF! 2