fix: add transliteration module to support non-latin URLs #423

Open
wants to merge 1 commit into
base: master
marharyta

Problem: getCanonicalPageId does not support non-latin page titles

Issue:

I am using [Notion.so](http://Notion.so) to run the [FinUA.org](http://FinUA.org) website, and it is currently deployed with [super.so](http://super.so). I have been using the nextjs-notion-starter-kit project for it (thank you).

After deploying the project to Vercel, I noticed quite a few browser warnings about the page, caused by the generated page URLs (they looked broken).
Screenshot 2023-01-21 at 18 41 25

the page behind it:
Screenshot 2023-01-21 at 18 41 37

Moreover, this page had the same generated URL, /-, despite being a separate page, and clicking on it led to the first page.

Screenshot 2023-01-21 at 18 42 13

I have investigated it, and it seems that the problem was in the module https://github.com/transitive-bullshit/nextjs-notion-starter-kit/blob/main/lib/get-canonical-page-id.ts

import { ExtendedRecordMap } from 'notion-types'
import {
  getCanonicalPageId as getCanonicalPageIdImpl,
  parsePageId
} from 'notion-utils'

import { inversePageUrlOverrides } from './config'

export function getCanonicalPageId(
  pageId: string,
  recordMap: ExtendedRecordMap,
  { uuid = true }: { uuid?: boolean } = {}
): string | null {
  const cleanPageId = parsePageId(pageId, { uuid: false })
  if (!cleanPageId) {
    return null
  }

  const override = inversePageUrlOverrides[cleanPageId]
  if (override) {
    return override
  } else {
    // PROBLEM: the call below seemed to be the issue
    return getCanonicalPageIdImpl(pageId, recordMap, {
      uuid
    })
  }
}

I went to the module https://github.com/NotionX/react-notion-x/tree/master/packages/notion-utils

and copied the https://github.com/NotionX/react-notion-x/blob/master/packages/notion-utils/src/get-canonical-page-id.ts module. The problem seemed to be in the getCanonicalPageId function: the call normalizeTitle(getBlockTitle(block, recordMap)) only seemed to work for Latin characters (its regex also allows the CJK range \u4e00-\u9fa5, but not Cyrillic).

I pulled out the normalizeTitle function and tested it, and yes, that seems to be the case:

function normalizeTitle(title) {
  return (title || '')
    .replace(/ /g, '-')
    .replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, '')
    .replace(/--/g, '-')
    .replace(/-$/, '')
    .replace(/^-/, '')
    .trim()
    .toLowerCase()
}

const eng = normalizeTitle('Naapurin Maalaiskana (NMK), in Lieto, in Turku area');
const ukr = normalizeTitle('Робота помічника з обслуговування контейнерів');
const ukr1 = normalizeTitle('Ищем литейщиков в Карккила, Финляндия, для обработки изделий в металлургической промышленности');
console.log('test', eng, ukr, ukr1)

// "test"
// "naapurin-maalaiskana-nmk-in-lieto-in-turku-area"
// ""
// "---"
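To make the failure easier to follow, here is a rough trace of the same replace chain, recording the string after each step. The helper name `traceNormalizeTitle` is hypothetical and only exists for this illustration; the regular expressions are copied from the function above.

```typescript
// Hypothetical helper: same steps as normalizeTitle, but it records
// the intermediate string after each step.
function traceNormalizeTitle(title: string): string[] {
  const steps: string[] = []
  let s = title || ''

  s = s.replace(/ /g, '-') // spaces become hyphens
  steps.push(s)

  s = s.replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, '') // Cyrillic letters are stripped here
  steps.push(s)

  s = s.replace(/--/g, '-') // single pass: each PAIR of hyphens becomes one
  steps.push(s)

  s = s.replace(/-$/, '').replace(/^-/, '').trim().toLowerCase()
  steps.push(s)

  return steps
}

console.log(traceNormalizeTitle('Робота помічника з обслуговування контейнерів'))
// [ 'Робота-помічника-з-обслуговування-контейнерів', '----', '--', '' ]
```

After the second step only the hyphens that replaced the spaces are left, so any title written entirely in Cyrillic collapses to an empty string or a run of hyphens. Different pages can then end up with the same slug.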

Solution:

The fix that worked for me was simply replacing normalizeTitle(getBlockTitle(block, recordMap)) with slugify from the transliteration npm package.
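As a minimal, dependency-free sketch of the idea (transliterate first, then slugify), the snippet below uses a small hand-written Cyrillic table. The names `CYRILLIC_TO_LATIN`, `transliterateSketch` and `slugifySketch` are hypothetical and are not part of this PR or of the transliteration package; the actual change relies on that package's `slugify`, which covers far more scripts, and its romanization may differ from this table.

```typescript
// Illustrative table only: Russian and Ukrainian letters, simplified romanization.
const CYRILLIC_TO_LATIN: Record<string, string> = {
  а: 'a', б: 'b', в: 'v', г: 'g', ґ: 'g', д: 'd', е: 'e', є: 'ye', ё: 'yo',
  ж: 'zh', з: 'z', и: 'i', і: 'i', ї: 'yi', й: 'y', к: 'k', л: 'l', м: 'm',
  н: 'n', о: 'o', п: 'p', р: 'r', с: 's', т: 't', у: 'u', ф: 'f', х: 'kh',
  ц: 'ts', ч: 'ch', ш: 'sh', щ: 'shch', ъ: '', ы: 'y', ь: '', э: 'e',
  ю: 'yu', я: 'ya'
}

// Lowercase the text and map each Cyrillic letter to Latin; other characters pass through.
// split('') is enough here because every letter in the table is a single UTF-16 code unit.
function transliterateSketch(text: string): string {
  return text
    .toLowerCase()
    .split('')
    .map((ch) => {
      const mapped = CYRILLIC_TO_LATIN[ch]
      return mapped !== undefined ? mapped : ch
    })
    .join('')
}

// Transliterate, then collapse every run of non-alphanumerics into one hyphen
// and trim hyphens from both ends.
function slugifySketch(title: string): string {
  return transliterateSketch(title || '')
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
}

console.log(slugifySketch('Робота помічника з обслуговування контейнерів'))
// "robota-pomichnika-z-obslugovuvannya-konteyneriv"
```

With this approach the Latin test title still yields naapurin-maalaiskana-nmk-in-lieto-in-turku-area, while the two Cyrillic titles get distinct, non-empty slugs.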

Notion Test Page ID

701245d6db8c413689d180e87269ee56


vercel bot commented Jan 21, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated |
| --- | --- | --- | --- | --- |
| react-notion-x | ✅ Ready (Inspect) | Visit Preview | 💬 Add your feedback | Jan 21, 2023 at 5:05PM (UTC) |
| react-notion-x-minimal-demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add your feedback | Jan 21, 2023 at 5:05PM (UTC) |
