⚡️ Speed up method AddContentToPage.process_table by 10% in src/backend/base/langflow/components/Notion/add_content_to_page.py #85

Open
wants to merge 1 commit into
base: main

@codeflash-ai codeflash-ai bot commented Dec 12, 2024

📄 AddContentToPage.process_table in src/backend/base/langflow/components/Notion/add_content_to_page.py

✨ Performance Summary:

  • Speed Increase: 📈 10% (about 1.10x the original speed)
  • Runtime Reduction: ⏱️ From 50.7 milliseconds down to 46.1 milliseconds (best of 38 runs)
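The two figures above can be related with a little arithmetic. The sketch below assumes the reported percentage is the time saved divided by the optimized runtime, which is one reading consistent with the numbers; the function names are hypothetical and not part of the PR.

```python
def relative_speedup(original_ms: float, optimized_ms: float) -> float:
    """Time saved, expressed as a fraction of the optimized runtime (assumed definition)."""
    return (original_ms - optimized_ms) / optimized_ms


def runtime_saved(original_ms: float, optimized_ms: float) -> float:
    """Time saved, expressed as a fraction of the original runtime."""
    return (original_ms - optimized_ms) / original_ms


# Runtimes reported in the performance summary.
ORIGINAL_MS = 50.7
OPTIMIZED_MS = 46.1

if __name__ == "__main__":
    print(f"speedup:       {relative_speedup(ORIGINAL_MS, OPTIMIZED_MS):.1%}")
    print(f"runtime saved: {runtime_saved(ORIGINAL_MS, OPTIMIZED_MS):.1%}")
```

Under that assumption the speedup rounds to 10%, while the runtime itself shrinks by a little over 9%, so the two ways of quoting the gain should not be mixed up.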

📝 Explanation and details

To speed up `AddContentToPage.process_table`, this change refactors the `create_block` method to reduce redundancy in its `if-elif` chain. The optimized version of the code is in this pull request's commit.

Changes made:

  1. Combined the repetitive branches that handle `rich_text` and similar attributes into a single `if` statement, backed by an auxiliary set `rich_text_block_types`.
  2. Used the `update` method when building the table block, to improve readability and efficiency.
  3. Removed redundant per-block-type property assignments by reducing the number of `if-elif` branches and grouping the logic more tightly.

This rewrite should reduce redundant operations, improving both performance and readability.
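The changes listed above can be illustrated with a minimal sketch. This is not the code from the PR: the function is written as a free function rather than a method, the members of `RICH_TEXT_BLOCK_TYPES`, the keyword arguments, and the exact payload layout are all assumptions modelled loosely on Notion-style block dictionaries.

```python
from __future__ import annotations

from typing import Any

# Assumed set of block types that share the same rich_text payload shape.
RICH_TEXT_BLOCK_TYPES = {
    "paragraph",
    "heading_1",
    "heading_2",
    "heading_3",
    "bulleted_list_item",
    "numbered_list_item",
    "quote",
}


def create_block(block_type: str, content: Any, **kwargs: Any) -> dict[str, Any]:
    """Build a block dictionary (hypothetical stand-in for the real method)."""
    block: dict[str, Any] = {"object": "block", "type": block_type, block_type: {}}

    if block_type in RICH_TEXT_BLOCK_TYPES:
        # One branch replaces a separate elif per rich-text block type.
        block[block_type]["rich_text"] = [{"type": "text", "text": {"content": content}}]
    elif block_type == "table":
        # dict.update sets all table properties in one call.
        block[block_type].update(
            table_width=kwargs.get("table_width", 0),
            has_column_header=kwargs.get("has_column_header", False),
            has_row_header=kwargs.get("has_row_header", False),
        )
    elif block_type == "table_row":
        # Each cell becomes its own single-item rich-text list.
        block[block_type]["cells"] = [
            [{"type": "text", "text": {"content": cell}}] for cell in content
        ]
    return block
```

A set membership test is a single hash lookup, so the common rich-text case no longer walks a long chain of string comparisons before reaching its branch.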


Correctness verification

The new optimized code was tested for correctness. The results are listed below:

| Test | Status | Details |
|------|--------|---------|
| ⚙️ Existing Unit Tests | 🔘 None Found | |
| 🌀 Generated Regression Tests | 29 Passed | See below |
| ⏪ Replay Tests | 🔘 None Found | |
| 🔎 Concolic Coverage Tests | 🔘 None Found | |
| 📊 Coverage | 100.0% | |

🌀 Generated Regression Tests Details

from typing import Any

# imports
import pytest  # used for our unit tests
from bs4 import BeautifulSoup  # used to parse HTML
# function to test
from langflow.base.langchain_utilities.model import LCToolComponent
from langflow.components.Notion.add_content_to_page import AddContentToPage


# unit tests
@pytest.fixture
def add_content_to_page():
    return AddContentToPage()

def test_single_row_table_without_header(add_content_to_page):
    html = "<table><tbody><tr><td>Cell 1</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_single_row_table_with_header(add_content_to_page):
    html = "<table><thead><tr><th>Header 1</th></tr></thead><tbody><tr><td>Cell 1</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_empty_table(add_content_to_page):
    html = "<table></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_various_cell_types(add_content_to_page):
    html = "<table><thead><tr><th>Header 1</th><td>Header 2</td></tr></thead><tbody><tr><th>Cell 1</th><td>Cell 2</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_nested_elements(add_content_to_page):
    html = "<table><tbody><tr><td><b>Bold Text</b></td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_special_characters(add_content_to_page):
    html = "<table><tbody><tr><td>&amp;</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)


def test_irregular_table_structures(add_content_to_page):
    html = "<table><tbody><tr><td>Cell 1</td></tr><tr><td>Cell 2</td><td>Cell 3</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_empty_cells(add_content_to_page):
    html = "<table><tbody><tr><td></td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_malformed_html(add_content_to_page):
    html = "<table><tbody><tr><td>Cell 1</td></tr></tbody>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_non_table_input(add_content_to_page):
    html = "<div>Not a table</div>"
    node = BeautifulSoup(html, "html.parser").find("div")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_merged_cells(add_content_to_page):
    html = "<table><tbody><tr><td rowspan='2'>Cell 1</td><td>Cell 2</td></tr><tr><td>Cell 3</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)

def test_table_with_multiple_headers(add_content_to_page):
    html = "<table><thead><tr><th>Header 1</th></tr><tr><th>Header 2</th></tr></thead><tbody><tr><td>Cell 1</td></tr></tbody></table>"
    node = BeautifulSoup(html, "html.parser").find("table")
    codeflash_output = add_content_to_page.process_table(node)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from typing import Any

# imports
import pytest  # used for our unit tests
from bs4 import BeautifulSoup  # used to parse HTML
# function to test
from langflow.base.langchain_utilities.model import LCToolComponent
from langflow.components.Notion.add_content_to_page import AddContentToPage


# unit tests
@pytest.fixture
def parser():
    return BeautifulSoup

def test_single_row_table_without_headers(parser):
    html = """
    <table>
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_multiple_rows_table_without_headers(parser):
    html = """
    <table>
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
            <tr><td>Cell 3</td><td>Cell 4</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_single_row_table_with_headers(parser):
    html = """
    <table>
        <thead>
            <tr><th>Header 1</th><th>Header 2</th></tr>
        </thead>
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_multiple_rows_table_with_headers(parser):
    html = """
    <table>
        <thead>
            <tr><th>Header 1</th><th>Header 2</th></tr>
        </thead>
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
            <tr><td>Cell 3</td><td>Cell 4</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_empty_table(parser):
    html = "<table></table>"
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_empty_cells(parser):
    html = """
    <table>
        <tbody>
            <tr><td></td><td>Cell 2</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_mixed_cell_types(parser):
    html = """
    <table>
        <thead>
            <tr><th>Header 1</th><td>Header 2</td></tr>
        </thead>
        <tbody>
            <tr><td>Cell 1</td><th>Cell 2</th></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_nested_tables(parser):
    html = """
    <table>
        <tbody>
            <tr>
                <td>
                    <table>
                        <tbody>
                            <tr><td>Nested Cell 1</td></tr>
                        </tbody>
                    </table>
                </td>
                <td>Cell 2</td>
            </tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_tables_with_merged_cells(parser):
    html = """
    <table>
        <tbody>
            <tr><td rowspan="2">Merged Cell</td><td>Cell 2</td></tr>
            <tr><td>Cell 3</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_tables_with_various_html_attributes(parser):
    html = """
    <table class="my-table" id="table1" style="width:100%;">
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)


def test_deeply_nested_tables(parser):
    html = """
    <table>
        <tbody>
            <tr>
                <td>
                    <table>
                        <tbody>
                            <tr>
                                <td>
                                    <table>
                                        <tbody>
                                            <tr><td>Deeply Nested Cell</td></tr>
                                        </tbody>
                                    </table>
                                </td>
                            </tr>
                        </tbody>
                    </table>
                </td>
            </tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_special_characters(parser):
    html = """
    <table>
        <tbody>
            <tr><td>&amp;</td><td>&lt;Cell&gt;</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_different_encodings(parser):
    html = """
    <table>
        <tbody>
            <tr><td>é</td><td>ü</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_missing_closing_tags(parser):
    html = """
    <table>
        <tbody>
            <tr><td>Cell 1<td>Cell 2</tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_incorrect_nesting(parser):
    html = """
    <table>
        <tbody>
            <tr><td>Cell 1</td><td>Cell 2</tr></td>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)

def test_table_with_mixed_content_types(parser):
    html = """
    <table>
        <tbody>
            <tr><td>Text</td><td><img src="image.jpg" alt="Image"></td></tr>
            <tr><td><a href="link.html">Link</a></td><td>More Text</td></tr>
        </tbody>
    </table>
    """
    node = parser(html, 'html.parser')
    processor = AddContentToPage()
    codeflash_output = processor.process_table(node)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
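
The generated tests contain no explicit assertions; as the comment notes, the captured `codeflash_output` is compared between the original and the optimized code. The sketch below shows one way such a comparison could work. It is an illustration only, not Codeflash's actual mechanism, and every name in it (`capture`, `assert_same_behavior`, the toy `row_widths_*` functions) is hypothetical.

```python
from typing import Any, Callable, Iterable, Tuple


def capture(func: Callable[[Any], Any], arg: Any) -> Tuple[str, Any]:
    """Run func(arg) and record either its return value or the exception type."""
    try:
        return ("ok", func(arg))
    except Exception as exc:  # an error counts as behavior to be preserved
        return ("error", type(exc).__name__)


def assert_same_behavior(
    original: Callable[[Any], Any],
    optimized: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> None:
    """Fail if the two implementations disagree on any input."""
    for arg in inputs:
        expected = capture(original, arg)
        actual = capture(optimized, arg)
        assert expected == actual, f"mismatch for {arg!r}: {expected} != {actual}"


# Toy stand-ins for an original and an optimized implementation.
def row_widths_original(rows):
    widths = []
    for row in rows:
        widths.append(len(row))
    return widths


def row_widths_optimized(rows):
    return [len(row) for row in rows]


if __name__ == "__main__":
    sample_inputs = [[], [["a"]], [["a"], ["b", "c"]], None]
    assert_same_behavior(row_widths_original, row_widths_optimized, sample_inputs)
    print("behavior matches on all sample inputs")
```

Recording the exception type alongside return values means inputs such as malformed or non-table HTML are checked for identical failure behavior too, not only for identical results.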

@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" label (Optimization PR opened by Codeflash AI) on Dec 12, 2024
@codeflash-ai codeflash-ai bot requested a review from misrasaurabh1 on December 12, 2024 09:25