---
title: "BQ Patent Exploration"
output: github_document
---
```{r, echo = FALSE, message = FALSE, warning = FALSE}
knitr::opts_chunk$set(
comment = "#>",
collapse = TRUE,
warning = FALSE,
message = FALSE,
echo = TRUE
)
```
## About
Google Patents Public Datasets is a collection of BigQuery tables from government, research, and private sources for statistical analysis of patent data.
<https://cloud.google.com/blog/products/gcp/google-patents-public-datasets-connecting-public-paid-and-private-patent-data>
This notebook gives an overview of the structure of the BigQuery table `patents-public-data.patents.publications_201912`, with a particular focus on accessing non-patent literature (NPL).
## Connect to Google Patents Public Datasets
```{r}
library(tidyverse)
library(DBI)
library(bigrquery)
con <- dbConnect(
bigrquery::bigquery(),
project = "cogent-tangent-279810"
)
```
Alternatively, you can query the data in the Google BigQuery web console: <https://console.cloud.google.com/bigquery?project=api-project-764811344545&p=patents-public-data&d=patents&t=publications_201912&page=table>
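
Because this notebook is about the structure of the table, it can help to look at the schema before writing queries. The chunk below is a minimal sketch, not part of the original analysis. It assumes that `bigrquery` can authenticate (interactively or through a cached token) and that the `publications_201912` snapshot is still available. The variable names `publications` and `fields` are illustrative.

```{r}
# Sketch: list the top-level fields of the publications table.
# Assumes bigrquery can authenticate; reading the schema is a metadata
# request, not a query job.
publications <- bigrquery::bq_table(
  project = "patents-public-data",
  dataset = "patents",
  table   = "publications_201912"
)
fields <- bigrquery::bq_table_fields(publications)

# Field names only; `citation` is the repeated record that holds `npl_text`
vapply(fields, function(f) f$name, character(1))
```

The queries in the following sections unnest the `citation` field to reach `npl_text`.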
## Patents citing non-patent literature
```{sql, connection = con}
-- Count distinct publications per country that cite at least one NPL reference
SELECT e.country_code, COUNT(DISTINCT e.publication_number) AS n_patents
FROM `patents-public-data.patents.publications_201912` AS e, UNNEST(e.citation) AS d
WHERE d.npl_text != ''
GROUP BY e.country_code
ORDER BY n_patents DESC
```
### Sample Patents
```{sql, connection = con}
-- Sample EP publications whose NPL citation text contains "doi" (LIKE is case-sensitive)
SELECT e.country_code, e.publication_number, d.npl_text
FROM `patents-public-data.patents.publications_201912` AS e, UNNEST(e.citation) AS d
WHERE d.npl_text LIKE '%doi%' AND e.country_code = 'EP'
LIMIT 10
```
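
The `LIKE '%doi%'` filter only checks that the citation text contains the string "doi". It does not isolate the identifier. The chunk below is a minimal sketch of how a DOI-like token could be pulled out of `npl_text` once the rows are in R. The helper `extract_doi()` and the two example strings are hypothetical and are not taken from the dataset. The pattern is a common heuristic, not a full DOI validator.

```{r}
# Hypothetical helper (not part of the original notebook): return the first
# DOI-like token found in each NPL citation string, or NA if none is found.
extract_doi <- function(npl_text) {
  # Heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix
  # that runs until whitespace, a quote, or an angle bracket
  pattern <- "10\\.[0-9]{4,9}/[^[:space:]\"<>]+"
  hit <- regexpr(pattern, npl_text, perl = TRUE)
  found <- !is.na(hit) & hit > 0
  doi <- rep(NA_character_, length(npl_text))
  doi[found] <- regmatches(npl_text, hit)
  # Drop punctuation that often trails a DOI at the end of a citation
  sub("[.,;)]+$", "", doi)
}

# Made-up citation strings, for illustration only
examples <- c(
  "SMITH J. et al.: 'Title', Journal, 2010, doi:10.1000/xyz123.",
  "Anonymous technical report without an identifier"
)
extract_doi(examples)
```

Applied to a query result, the helper would be called on the `npl_text` column, for example inside `dplyr::mutate()`.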
```{sql, connection = con}
-- NPL citations of a single application
SELECT e.country_code, e.publication_number, d.npl_text
FROM `patents-public-data.patents.publications_201912` AS e, UNNEST(e.citation) AS d
WHERE e.application_number = 'US7527788'
LIMIT 10
```