Commit

Convert Youdao Cloud Note docx to Markdown, import into Hugo and publish to GitHub Pages, adapt for multi-platform publishing
CYRUS-STUDIO committed Aug 10, 2024
1 parent d5af1e4 commit 7fe91ae
Showing 9 changed files with 375 additions and 583 deletions.

10 changes: 5 additions & 5 deletions public/index.html
@@ -46,8 +46,8 @@
<meta property="og:type" content="website">

<meta itemprop="name" content="CYRUS STUDIO">
-<meta itemprop="datePublished" content="2024-08-10T16:09:41+08:00">
-<meta itemprop="dateModified" content="2024-08-10T16:09:41+08:00">
+<meta itemprop="datePublished" content="2024-08-11T05:06:18+08:00">
+<meta itemprop="dateModified" content="2024-08-11T05:06:18+08:00">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="CYRUS STUDIO">

@@ -127,8 +127,8 @@ <h1 class="flex-none">

<div class="blah w-100">
<h1 class="f3 fw1 athelas mt0 lh-title">
-<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/" class="color-inherit dim link">
-有道云导出docx,docx转换markdown,导入hugo发布到github page
+<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/" class="color-inherit dim link">
+有道云docx转换markdown,导入hugo发布到github page,多平台发布适配
</a>
</h1>
<div class="f6 f5-l lh-copy nested-copy-line-height nested-links">
@@ -144,7 +144,7 @@ <h1 class="f3 fw1 athelas mt0 lh-title">
def ynote_docx_markdown_transform(document):...passresult = convert_to_markdown(docx_file, transform_document=ynote_docx_markdown_transform) 通过在自定义 transform 断点调试可以看到 document 都是由一个一个 Paragraph 组成的,代码块的 Bookmark 的 name 都是相同的,由此代码块其中一个特征就是相同且相邻的 Bookmark name。 但是有的代码块只是单独的一段 这时可以通过自定义 代码/bash 特征判断该 Paragraph 中的 Text 是不是一段 代码/bash。
def is_possible_code_or_bash(text):# 常见的代码关键字code_keywords = [r&#39;\bif\b&#39;, r&#39;\bfor\b&#39;, r&#39;\bwhile\b&#39;, r&#39;\bdef\b&#39;, r&#39;\bclass\b&#39;, r&#39;\breturn\b&#39;, r&#39;\bimport\b&#39;,r&#39;\bint\b&#39;, r&#39;\bfloat\b&#39;, r&#39;\bmain\b&#39;, r&#39;\binclude\b&#39;, r&#39;#include&#39;, r&#39;\becho\b&#39;, r&#39;\bcd\b&#39;,r&#39;\bgrep\b&#39;, r&#39;\bexit\b&#39;, r&#39;\belse\b&#39;, r&#39;\belif\b&#39;, r&#39;#!
</div>
-<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/" class="ba b--moon-gray bg-light-gray br2 color-inherit dib f7 hover-bg-moon-gray link mt2 ph2 pv1">read more</a>
+<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/" class="ba b--moon-gray bg-light-gray br2 color-inherit dib f7 hover-bg-moon-gray link mt2 ph2 pv1">read more</a>

</div>
</div>
10 changes: 5 additions & 5 deletions public/index.xml
@@ -6,13 +6,13 @@
<description>Recent content on CYRUS STUDIO</description>
<generator>Hugo</generator>
<language>zh-CN</language>
-<lastBuildDate>Sat, 10 Aug 2024 16:09:41 +0800</lastBuildDate>
+<lastBuildDate>Sun, 11 Aug 2024 05:06:18 +0800</lastBuildDate>
<atom:link href="https://cyrus-studio.github.io/blog/index.xml" rel="self" type="application/rss+xml" />
<item>
-<title>有道云导出docx,docx转换markdown,导入hugo发布到github page</title>
-<link>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/</link>
-<pubDate>Sat, 10 Aug 2024 16:09:41 +0800</pubDate>
-<guid>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/</guid>
+<title>有道云docx转换markdown,导入hugo发布到github page,多平台发布适配</title>
+<link>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/</link>
+<pubDate>Sun, 11 Aug 2024 05:06:18 +0800</pubDate>
+<guid>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/</guid>
<description>有道云导出docx&#xA;有道云笔记右上角更多按钮选择【导出为Word】,可以导出docx文档 docx转换markdown&#xA;尝试了几个docx转markdown的python库后,最终选择了python-mammoth,轻量,效率高,可自定义转换满足特定需求。&#xA;python-mammoth&#xA;python-mammoth 是一个用于将 Microsoft Word (DOCX) 文档转换为 HTML 或 Markdown 的 Python 库。&#xA;github地址:https://github.com/mwilliamson/python-mammoth&#xA;安装 python-mammoth&#xA;pip install mammoth 自定义代码块样式&#xA;通过自定义 transform 来实现自定义的代码块样式来支持有道云docx的代码块&#xA;def ynote_docx_markdown_transform(document):&#xD;...&#xD;pass&#xD;result = convert_to_markdown(docx_file, transform_document=ynote_docx_markdown_transform) 通过在自定义 transform 断点调试可以看到 document 都是由一个一个 Paragraph 组成的,代码块的 Bookmark 的 name 都是相同的,由此代码块其中一个特征就是相同且相邻的 Bookmark name。 但是有的代码块只是单独的一段 这时可以通过自定义 代码/bash 特征判断该 Paragraph 中的 Text 是不是一段 代码/bash。&#xA;def is_possible_code_or_bash(text):&#xD;# 常见的代码关键字&#xD;code_keywords = [&#xD;r&amp;#39;\bif\b&amp;#39;, r&amp;#39;\bfor\b&amp;#39;, r&amp;#39;\bwhile\b&amp;#39;, r&amp;#39;\bdef\b&amp;#39;, r&amp;#39;\bclass\b&amp;#39;, r&amp;#39;\breturn\b&amp;#39;, r&amp;#39;\bimport\b&amp;#39;,&#xD;r&amp;#39;\bint\b&amp;#39;, r&amp;#39;\bfloat\b&amp;#39;, r&amp;#39;\bmain\b&amp;#39;, r&amp;#39;\binclude\b&amp;#39;, r&amp;#39;#include&amp;#39;, r&amp;#39;\becho\b&amp;#39;, r&amp;#39;\bcd\b&amp;#39;,&#xD;r&amp;#39;\bgrep\b&amp;#39;, r&amp;#39;\bexit\b&amp;#39;, r&amp;#39;\belse\b&amp;#39;, r&amp;#39;\belif\b&amp;#39;, r&amp;#39;#!</description>
</item>
<item>
10 changes: 5 additions & 5 deletions public/posts/index.html
@@ -46,8 +46,8 @@
<meta property="og:type" content="website">

<meta itemprop="name" content="Posts">
-<meta itemprop="datePublished" content="2024-08-10T16:09:41+08:00">
-<meta itemprop="dateModified" content="2024-08-10T16:09:41+08:00">
+<meta itemprop="datePublished" content="2024-08-11T05:06:18+08:00">
+<meta itemprop="dateModified" content="2024-08-11T05:06:18+08:00">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="Posts">

@@ -101,8 +101,8 @@ <h1 class="f2 f-subheadline-l fw2 light-silver mb0 lh-title">
<div class="bg-white mb3 pa4 gray overflow-hidden">
<span class="f6 db">Posts</span>
<h1 class="f3 near-black">
-<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/" class="link black dim">
-有道云导出docx,docx转换markdown,导入hugo发布到github page
+<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/" class="link black dim">
+有道云docx转换markdown,导入hugo发布到github page,多平台发布适配
</a>
</h1>
<div class="nested-links f5 lh-copy nested-copy-line-height">
@@ -118,7 +118,7 @@ <h1 class="f3 near-black">
def ynote_docx_markdown_transform(document):...passresult = convert_to_markdown(docx_file, transform_document=ynote_docx_markdown_transform) 通过在自定义 transform 断点调试可以看到 document 都是由一个一个 Paragraph 组成的,代码块的 Bookmark 的 name 都是相同的,由此代码块其中一个特征就是相同且相邻的 Bookmark name。 但是有的代码块只是单独的一段 这时可以通过自定义 代码/bash 特征判断该 Paragraph 中的 Text 是不是一段 代码/bash。
def is_possible_code_or_bash(text):# 常见的代码关键字code_keywords = [r&#39;\bif\b&#39;, r&#39;\bfor\b&#39;, r&#39;\bwhile\b&#39;, r&#39;\bdef\b&#39;, r&#39;\bclass\b&#39;, r&#39;\breturn\b&#39;, r&#39;\bimport\b&#39;,r&#39;\bint\b&#39;, r&#39;\bfloat\b&#39;, r&#39;\bmain\b&#39;, r&#39;\binclude\b&#39;, r&#39;#include&#39;, r&#39;\becho\b&#39;, r&#39;\bcd\b&#39;,r&#39;\bgrep\b&#39;, r&#39;\bexit\b&#39;, r&#39;\belse\b&#39;, r&#39;\belif\b&#39;, r&#39;#!
</div>
-<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/" class="ba b--moon-gray bg-light-gray br2 color-inherit dib f7 hover-bg-moon-gray link mt2 ph2 pv1">read more</a>
+<a href="/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/" class="ba b--moon-gray bg-light-gray br2 color-inherit dib f7 hover-bg-moon-gray link mt2 ph2 pv1">read more</a>
</div>
</div>

10 changes: 5 additions & 5 deletions public/posts/index.xml
@@ -6,13 +6,13 @@
<description>Recent content in Posts on CYRUS STUDIO</description>
<generator>Hugo</generator>
<language>zh-CN</language>
-<lastBuildDate>Sat, 10 Aug 2024 16:09:41 +0800</lastBuildDate>
+<lastBuildDate>Sun, 11 Aug 2024 05:06:18 +0800</lastBuildDate>
<atom:link href="https://cyrus-studio.github.io/blog/posts/index.xml" rel="self" type="application/rss+xml" />
<item>
-<title>有道云导出docx,docx转换markdown,导入hugo发布到github page</title>
-<link>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/</link>
-<pubDate>Sat, 10 Aug 2024 16:09:41 +0800</pubDate>
-<guid>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91%E5%AF%BC%E5%87%BAdocxdocx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page/</guid>
+<title>有道云docx转换markdown,导入hugo发布到github page,多平台发布适配</title>
+<link>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/</link>
+<pubDate>Sun, 11 Aug 2024 05:06:18 +0800</pubDate>
+<guid>https://cyrus-studio.github.io/blog/posts/%E6%9C%89%E9%81%93%E4%BA%91docx%E8%BD%AC%E6%8D%A2markdown%E5%AF%BC%E5%85%A5hugo%E5%8F%91%E5%B8%83%E5%88%B0github-page%E5%A4%9A%E5%B9%B3%E5%8F%B0%E5%8F%91%E5%B8%83%E9%80%82%E9%85%8D/</guid>
<description>有道云导出docx&#xA;有道云笔记右上角更多按钮选择【导出为Word】,可以导出docx文档 docx转换markdown&#xA;尝试了几个docx转markdown的python库后,最终选择了python-mammoth,轻量,效率高,可自定义转换满足特定需求。&#xA;python-mammoth&#xA;python-mammoth 是一个用于将 Microsoft Word (DOCX) 文档转换为 HTML 或 Markdown 的 Python 库。&#xA;github地址:https://github.com/mwilliamson/python-mammoth&#xA;安装 python-mammoth&#xA;pip install mammoth 自定义代码块样式&#xA;通过自定义 transform 来实现自定义的代码块样式来支持有道云docx的代码块&#xA;def ynote_docx_markdown_transform(document):&#xD;...&#xD;pass&#xD;result = convert_to_markdown(docx_file, transform_document=ynote_docx_markdown_transform) 通过在自定义 transform 断点调试可以看到 document 都是由一个一个 Paragraph 组成的,代码块的 Bookmark 的 name 都是相同的,由此代码块其中一个特征就是相同且相邻的 Bookmark name。 但是有的代码块只是单独的一段 这时可以通过自定义 代码/bash 特征判断该 Paragraph 中的 Text 是不是一段 代码/bash。&#xA;def is_possible_code_or_bash(text):&#xD;# 常见的代码关键字&#xD;code_keywords = [&#xD;r&amp;#39;\bif\b&amp;#39;, r&amp;#39;\bfor\b&amp;#39;, r&amp;#39;\bwhile\b&amp;#39;, r&amp;#39;\bdef\b&amp;#39;, r&amp;#39;\bclass\b&amp;#39;, r&amp;#39;\breturn\b&amp;#39;, r&amp;#39;\bimport\b&amp;#39;,&#xD;r&amp;#39;\bint\b&amp;#39;, r&amp;#39;\bfloat\b&amp;#39;, r&amp;#39;\bmain\b&amp;#39;, r&amp;#39;\binclude\b&amp;#39;, r&amp;#39;#include&amp;#39;, r&amp;#39;\becho\b&amp;#39;, r&amp;#39;\bcd\b&amp;#39;,&#xD;r&amp;#39;\bgrep\b&amp;#39;, r&amp;#39;\bexit\b&amp;#39;, r&amp;#39;\belse\b&amp;#39;, r&amp;#39;\belif\b&amp;#39;, r&amp;#39;#!</description>
</item>
<item>
156 changes: 0 additions & 156 deletions public/posts/my-first-post/index.html

This file was deleted.
