Merge pull request #434 from liam-hq/use_cheerio_to_extract_project_t…
…itle

♻️ Add cheerio for HTML parsing in metadata extraction
MH4GF authored Jan 9, 2025
2 parents 2a5eafc + f51453b commit c19c514
Showing 4 changed files with 385 additions and 8 deletions.
215 changes: 212 additions & 3 deletions docs/packages-license.md
@@ -2,10 +2,10 @@


## Summary
-* 965 MIT
+* 975 MIT
* 68 Apache 2.0
-* 60 ISC
-* 19 Simplified BSD
+* 61 ISC
+* 27 Simplified BSD
* 17 New BSD
* 3 MIT OR Apache-2.0
* 3 BlueOak-1.0.0
@@ -4002,6 +4002,17 @@ Python-2.0 manually approved



<a name="boolbase"></a>
### boolbase v1.0.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://en.wikipedia.org/wiki/ISC_license">ISC</a> permitted



<a name="brace-expansion"></a>
### brace-expansion v1.1.11
####
@@ -4288,6 +4299,28 @@ CC-BY-4.0 permitted



<a name="cheerio"></a>
### cheerio v1.0.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="cheerio-select"></a>
### cheerio-select v2.1.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="chokidar"></a>
### chokidar v3.6.0
####
@@ -4684,6 +4717,28 @@ CC-BY-4.0 permitted



<a name="css-select"></a>
### css-select v5.1.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="css-what"></a>
### css-what v6.1.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="css.escape"></a>
### css.escape v1.5.1
####
@@ -5157,6 +5212,50 @@ CC-BY-4.0 permitted



<a name="dom-serializer"></a>
### dom-serializer v2.0.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="domelementtype"></a>
### domelementtype v2.3.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="domhandler"></a>
### domhandler v5.0.3
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="domutils"></a>
### domutils v3.2.2
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="dot-case"></a>
### dot-case v2.1.1
####
@@ -5267,6 +5366,17 @@ CC-BY-4.0 permitted



<a name="encoding-sniffer"></a>
### encoding-sniffer v0.2.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="enhanced-resolve"></a>
### enhanced-resolve v5.17.1
####
@@ -5289,6 +5399,17 @@ CC-BY-4.0 permitted



<a name="entities"></a>
### entities v4.5.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="env-paths"></a>
### env-paths v2.2.1
####
@@ -6664,6 +6785,17 @@ CC-BY-4.0 permitted



<a name="htmlparser2"></a>
### htmlparser2 v9.1.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="http-errors"></a>
### http-errors v2.0.0
####
@@ -9044,6 +9176,17 @@ Public Domain manually approved



<a name="nth-check"></a>
### nth-check v2.1.1
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/bsd-license">Simplified BSD</a> permitted



<a name="object-assign"></a>
### object-assign v4.1.1
####
@@ -9396,6 +9539,39 @@ BlueOak-1.0.0 permitted



<a name="parse5"></a>
### parse5 v7.2.1
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="parse5-htmlparser2-tree-adapter"></a>
### parse5-htmlparser2-tree-adapter v7.1.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="parse5-parser-stream"></a>
### parse5-parser-stream v7.1.2
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="parseurl"></a>
### parseurl v1.3.3
####
@@ -11930,6 +12106,17 @@ Unknown manually approved



<a name="undici"></a>
### undici v6.21.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="undici-types"></a>
### undici-types v6.19.8
####
@@ -12403,6 +12590,28 @@ Unknown manually approved



<a name="whatwg-encoding"></a>
### whatwg-encoding v3.1.1
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="whatwg-mimetype"></a>
### whatwg-mimetype v4.0.0
####

##### Paths
* /home/runner/work/liam/liam

<a href="http://opensource.org/licenses/mit-license">MIT</a> permitted



<a name="whatwg-url"></a>
### whatwg-url v5.0.0
####
10 changes: 5 additions & 5 deletions frontend/apps/erd-web/app/erd/p/[...slug]/page.tsx
@@ -9,6 +9,7 @@ import {
   supportedFormatSchema,
 } from '@liam-hq/db-structure/parser'
 import * as Sentry from '@sentry/nextjs'
+import { load } from 'cheerio'
 import type { Metadata } from 'next'
 import { cookies } from 'next/headers'
 import { notFound } from 'next/navigation'
@@ -54,11 +55,10 @@ export async function generateMetadata({
   const projectName = await (async () => {
     if (res?.ok) {
       const html = await res.text()
-      const ogTitleMatch = html.match(
-        /<meta property="og:title" content="([^"]+)" \/>/,
-      )
-      const htmlTitleMatch = html.match(/<title>([^<]+)<\/title>/)
-      return ogTitleMatch?.[1] ?? htmlTitleMatch?.[1] ?? joinedPath
+      const $ = load(html)
+      const ogTitle = $('meta[property="og:title"]').attr('content')
+      const htmlTitle = $('title').text()
+      return ogTitle || htmlTitle || joinedPath
     }
     return joinedPath
   })()
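
Reviewer note: the sketch below is not part of the commit. It illustrates, under stated assumptions, two likely reasons the regex lookup was replaced. The two patterns are copied from the removed lines; the helper names (`titleFromRegex`, `pickTitleNullish`, `pickTitle`) and the sample HTML strings are hypothetical. It deliberately avoids importing cheerio so it runs standalone, and it assumes the selector-based lookup yields `undefined` for a missing `content` attribute and an empty string for a missing `<title>`.

```typescript
// Patterns copied from the lines removed in this hunk.
const OG_TITLE_RE = /<meta property="og:title" content="([^"]+)" \/>/
const HTML_TITLE_RE = /<title>([^<]+)<\/title>/

// Hypothetical helper reproducing the pre-commit regex lookup.
function titleFromRegex(html: string, fallback: string): string {
  const ogTitleMatch = html.match(OG_TITLE_RE)
  const htmlTitleMatch = html.match(HTML_TITLE_RE)
  return ogTitleMatch?.[1] ?? htmlTitleMatch?.[1] ?? fallback
}

// Hypothetical helper: the old `??` chain applied to selector-style results.
// `??` only skips null/undefined, so an empty title string would be returned.
function pickTitleNullish(
  ogTitle: string | undefined,
  htmlTitle: string | undefined,
  fallback: string,
): string {
  return ogTitle ?? htmlTitle ?? fallback
}

// Hypothetical helper mirroring the post-commit `||` chain, which also
// skips empty strings.
function pickTitle(
  ogTitle: string | undefined,
  htmlTitle: string | undefined,
  fallback: string,
): string {
  return ogTitle || htmlTitle || fallback
}

// Sample markup (assumed, not taken from the project).
const exact = '<meta property="og:title" content="Demo" />'
const reordered = '<meta content="Demo" property="og:title">'

console.log(titleFromRegex(exact, 'fallback'))     // regex matches this exact layout
console.log(titleFromRegex(reordered, 'fallback')) // attribute order differs, regex misses
console.log(JSON.stringify(pickTitleNullish(undefined, '', 'fallback')))
console.log(pickTitle(undefined, '', 'fallback'))
```

The regex only matches one exact attribute order and a self-closing ` />`, whereas an attribute selector such as `meta[property="og:title"]` does not depend on either. The switch from `??` to `||` matters for the same reason: with an empty title string, `??` would return the empty string instead of falling through to `joinedPath`.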
1 change: 1 addition & 0 deletions frontend/apps/erd-web/package.json
@@ -6,6 +6,7 @@
     "@liam-hq/db-structure": "workspace:*",
     "@liam-hq/erd-core": "workspace:*",
     "@sentry/nextjs": "8",
+    "cheerio": "1.0.0",
     "next": "15.1.2",
     "react": "18.3.1",
     "react-dom": "18",