Add blockquote functionality #73

Merged
merged 7 commits into from
Nov 15, 2024
Changes from 4 commits
57 changes: 57 additions & 0 deletions parser/__tests__/parser.test.js
@@ -476,6 +476,63 @@ describe("parseParagraph", () => {
     const nestingSpacer = " ";
     expect(result2).toEqual(nestingSpacer + "1. Hello, world!");
   });
+
+  it("should format indented text as block quotes", () => {
+    const indentedParagraph = {
+      elements: [{ textRun: { content: "This is an indented quote" } }],
+      paragraphStyle: {
+        namedStyleType: "NORMAL_TEXT",
+        indentStart: { magnitude: 36 }, // Standard indentation button level
+      },
+    };
+    const result = parseParagraph(documentContext)(indentedParagraph);
+    expect(result).toEqual("> This is an indented quote");
+  });
+
+  it("should not format text with small indentation as block quotes", () => {
+    const slightlyIndentedParagraph = {
+      elements: [{ textRun: { content: "This has small indentation" } }],
+      paragraphStyle: {
+        namedStyleType: "NORMAL_TEXT",
+        indentStart: { magnitude: 10 }, // Below our threshold of 18
+      },
+    };
+    const result = parseParagraph(documentContext)(slightlyIndentedParagraph);
+    expect(result).toEqual("This has small indentation");
+  });
+
+  it("should handle multiline block quotes", () => {
+    const multilineParagraph = {
+      elements: [
+        { textRun: { content: "First line\nSecond line\nThird line" } },
+      ],
+      paragraphStyle: {
+        namedStyleType: "NORMAL_TEXT",
+        indentStart: { magnitude: 36 },
+      },
+    };
+    const result = parseParagraph(documentContext)(multilineParagraph);
+    expect(result).toEqual("> First line\n> Second line\n> Third line");
+  });
+
+  it("should handle block quotes with other formatting", () => {
+    const formattedParagraph = {
+      elements: [
+        {
+          textRun: {
+            content: "Bold quote",
+            textStyle: { bold: true },
+          },
+        },
+      ],
+      paragraphStyle: {
+        namedStyleType: "NORMAL_TEXT",
+        indentStart: { magnitude: 36 },
+      },
+    };
+    const result = parseParagraph(documentContext)(formattedParagraph);
+    expect(result).toEqual("> **Bold quote**");
+  });
 });

 describe("parseDoc", () => {
15 changes: 12 additions & 3 deletions parser/parser.js
@@ -239,7 +239,12 @@ export const mergeSameElements = (elements) =>

 export const parseParagraph = (documentContext) => (paragraph) => {
   const { elements, ...paragraphContext } = paragraph;
-  const paragraphStyleName = paragraphContext.paragraphStyle?.namedStyleType;
+  const paragraphStyle = paragraphContext.paragraphStyle || {};
+  const paragraphStyleName = paragraphStyle.namedStyleType;
Contributor Author

I'm not sure about this part, it was recommended by Claude

Contributor

I think this is OK. In both the previous and the new code, if paragraphContext.paragraphStyle is undefined, then paragraphStyleName will be undefined. I think this change is just to have paragraphStyle as a separate variable so it can be used below.
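
To illustrate the point above, here is a minimal sketch comparing the two forms of the declaration. The helpers `styleNameOld` and `styleNameNew` are made up for this example and are not part of parser.js; they only mirror the removed and added lines.

```javascript
// Hypothetical helpers mirroring the old and new declarations (not in parser.js).
const styleNameOld = (paragraphContext) =>
  paragraphContext.paragraphStyle?.namedStyleType;

const styleNameNew = (paragraphContext) => {
  // Fall back to an empty object so later property reads are safe.
  const paragraphStyle = paragraphContext.paragraphStyle || {};
  return paragraphStyle.namedStyleType;
};

const withStyle = { paragraphStyle: { namedStyleType: "HEADING_2" } };
const withoutStyle = {};

console.log(styleNameOld(withStyle), styleNameNew(withStyle)); // HEADING_2 HEADING_2
console.log(styleNameOld(withoutStyle), styleNameNew(withoutStyle)); // undefined undefined
```

One small difference: `?.` only short-circuits on `null` and `undefined`, while `||` falls back on any falsy value. For a style that is either an object or missing, the two should behave the same.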


+  // Check indentation - Google Docs API provides this in PT units
+  const indentStart = paragraphStyle.indentStart?.magnitude || 0;
+  const QUOTE_INDENT_THRESHOLD = 18; // Standard indentation button is 36pt, we test lower

   let md = mergeSameElements(elements).map(
     parseElement({ documentContext, paragraphContext })
@@ -250,7 +255,7 @@ export const parseParagraph = (documentContext) => (paragraph) => {
   let leadingSpace = "";

   // First we check if the "paragraph" is a heading, because the markdown for a heading is the first thing we need to output
-  if (paragraphStyleName.indexOf("HEADING_") === 0) {
+  if (paragraphStyleName?.indexOf("HEADING_") === 0) {
Contributor

Is this a separate bug fix, or does it relate to the blockquote functionality?

Contributor Author

In the previous version, the ? was integrated directly in the declaration of the variable. This seems to be an update (recommended by Claude) to keep the same functionality. I'm surprised that namedStyleType is not used for it, but I don't understand what this part does.
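
For context, a rough sketch of what this part appears to do. `paragraphStyleName` holds the `namedStyleType` value, so the style name is in fact what drives the check. The standalone `headingPrefix` function below is hypothetical (in parser.js this logic is inline), and it assumes style names of the form "HEADING_1" through "HEADING_6".

```javascript
// Hypothetical standalone version of the heading check, for illustration only.
const headingPrefix = (paragraphStyleName) => {
  // Without `?.`, an undefined style name would throw a TypeError on .indexOf.
  if (paragraphStyleName?.indexOf("HEADING_") === 0) {
    // "HEADING_" is 8 characters long, so index 8 holds the level digit.
    const headingLevel = parseInt(paragraphStyleName[8], 10);
    return "#".repeat(headingLevel) + " ";
  }
  return "";
};

console.log(JSON.stringify(headingPrefix("HEADING_3"))); // "### "
console.log(JSON.stringify(headingPrefix("NORMAL_TEXT"))); // ""
console.log(JSON.stringify(headingPrefix(undefined))); // ""
```

Since `paragraphStyleName` can be undefined in both the old and the new declaration, the added `?.` reads as a guard against a missing paragraph style rather than something specific to block quotes.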

     const headingLevel = parseInt(paragraphStyleName[8]);
     const headingPrefix = new Array(headingLevel).fill("#").join("") + " ";
     prefix = headingPrefix;
@@ -296,11 +301,15 @@ export const parseParagraph = (documentContext) => (paragraph) => {
       md.join("").replaceAll("\n", "\n" + leadingSpace + " ")
     );
   } else {
+    // Add quote marker if the paragraph is indented beyond our threshold
+    const isQuote = indentStart >= QUOTE_INDENT_THRESHOLD;
+    const quotePrefix = isQuote ? "> " : "";
Contributor

A bit of a nit so no problem if you want to keep it as is, but a consideration could be that this is introducing a nested conditional (the ternary in the else). If text classifications are mutually exclusive (i.e. the text can either be bullet, block quote or normal) then having the conditionals in a single level chain can make this clearer, IMO.

To lift the conditional, the isQuote logic could be moved next to where indentStart and QUOTE_INDENT_THRESHOLD are declared (might be good to have them all together anyways) or a small isQuote(paragraphStyle) function could be defined.

// Add quote marker if the paragraph is indented beyond our threshold
const isQuote = indentStart >= QUOTE_INDENT_THRESHOLD;

Then the conditional would just be

Suggested change
-  } else {
-    // Add quote marker if the paragraph is indented beyond our threshold
-    const isQuote = indentStart >= QUOTE_INDENT_THRESHOLD;
-    const quotePrefix = isQuote ? "> " : "";
+  } else if (isQuote) {
+    const quotePrefix = "> ";

Note an else statement with a separate return would still be necessary for normal text.
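
As a sketch of the second option mentioned above, a small `isQuote(paragraphStyle)` helper could look roughly like this. The helper is hypothetical and not part of the PR as shown; only the threshold value and the `indentStart.magnitude` lookup come from the diff.

```javascript
// Threshold from the PR: the indent button adds 36pt, so anything >= 18pt counts.
const QUOTE_INDENT_THRESHOLD = 18;

// Hypothetical helper along the lines suggested above (not part of the PR).
const isQuote = (paragraphStyle = {}) =>
  (paragraphStyle.indentStart?.magnitude || 0) >= QUOTE_INDENT_THRESHOLD;

console.log(isQuote({ indentStart: { magnitude: 36 } })); // true
console.log(isQuote({ indentStart: { magnitude: 10 } })); // false
console.log(isQuote({})); // false
```

Keeping the threshold and the check together in one place would let the branch chain read as heading, list item, quote, then normal text.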

Contributor Author

I ended up defining quotePrefix everywhere as an empty string, and updating it if needed. LMK what you think of this type of logic

Contributor

Yea, current logic looks good to me 👍
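
The later commits are not shown in this view, so the following is only a guess at the "default to an empty string, update if needed" pattern described above. `formatParagraph` is a simplified, hypothetical stand-in for the normal-text branch of `parseParagraph`, taking already-rendered text instead of document elements.

```javascript
// Hypothetical, simplified stand-in for the normal-text branch of parseParagraph.
const QUOTE_INDENT_THRESHOLD = 18;

const formatParagraph = (text, indentStart = 0, leadingSpace = "") => {
  // Default to no quote marker, then update it only when needed.
  let quotePrefix = "";
  if (indentStart >= QUOTE_INDENT_THRESHOLD) {
    quotePrefix = "> ";
  }
  // Re-apply the prefix after every newline so each line stays in the quote.
  return (
    leadingSpace +
    quotePrefix +
    text.replaceAll("\n", "\n" + leadingSpace + quotePrefix)
  );
};

console.log(formatParagraph("First line\nSecond line", 36));
// > First line
// > Second line
console.log(formatParagraph("Plain text", 10)); // Plain text
```

Because an empty `quotePrefix` is a no-op in the concatenation, the same return expression can serve both quoted and unquoted paragraphs.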

     return (
       leadingSpace +
       itemMarker +
+      quotePrefix +
       prefix +
-      md.join("").replaceAll("\n", "\n" + leadingSpace)
+      md.join("").replaceAll("\n", "\n" + leadingSpace + quotePrefix)
     );
   }
 };