Performance Issue: Taking excessive time to convert Blocks to Markdown #107

Open
tictaqqn opened this issue May 9, 2024 · 1 comment

tictaqqn commented May 9, 2024

Description

I've encountered a performance issue with the notion-to-md package where the conversion of Notion pages to Markdown takes an excessively long time.

Steps to Reproduce

Run the script below. saveBlocksToFile finishes within 5 seconds, but saveBlocksToMarkdown takes more than 10 minutes.

import 'dotenv/config'
import { promises as fs } from 'fs';

import { Client } from '@notionhq/client';
import { NotionToMarkdown } from 'notion-to-md';

// Read the Notion API token and page ID from environment variables
const notionToken = process.env.NOTION_TOKEN;
const pageId = process.env.PAGE_ID ?? '';  // Page ID in UUID format

const notion = new Client({
    auth: notionToken
});

const n2m = new NotionToMarkdown({ notionClient: notion });

async function fetchAllBlocks(blockId: string, startCursor?: string) {
    let blocks: unknown[] = [];
    let hasMore = true;
    let cursor = startCursor;

    while (hasMore) {
        const response = await notion.blocks.children.list({
            block_id: blockId,
            start_cursor: cursor,
            page_size: 100
        });
        blocks = blocks.concat(response.results);
        hasMore = response.has_more;
        cursor = response.next_cursor ?? undefined;
    }

    return blocks;
}

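async function saveBlocksToMarkdown(blocks: unknown[], filename: string) {
    // blocksToMarkdown returns an array of MdBlock objects, not a string
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    const mdBlocks = await n2m.blocksToMarkdown(blocks as any[]);
    // In v3, toMarkdownString returns an object; `parent` holds the page's Markdown
    const markdown = n2m.toMarkdownString(mdBlocks).parent ?? '';
    await fs.writeFile(filename, markdown, 'utf-8');
    console.log(`Markdown saved to ${filename}`);
}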

async function saveBlocksToFile(blocks: unknown[], filename: string) {
    const jsonContent = JSON.stringify(blocks, null, 2);
    await fs.writeFile(filename, jsonContent, 'utf-8');
    console.log(`Blocks saved to ${filename}`);
}

async function run() {
    try {
        const blocks = await fetchAllBlocks(pageId);
        await saveBlocksToFile(blocks, './outputs/notion_blocks.json');
        await saveBlocksToMarkdown(blocks, './outputs/notion_blocks.md')
    } catch (error) {
        console.error('Error retrieving or saving Notion blocks:', error);
    }
}

run();
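
To confirm which step accounts for the time, each call in run() can be wrapped in a small timing helper. The sketch below is only an illustration: `timed` is a hypothetical name, not part of notion-to-md or the Notion client, and it measures wall-clock time with `Date.now()`.

```typescript
// Hypothetical helper (not part of notion-to-md or @notionhq/client):
// runs one async step, logs its wall-clock duration, and returns both.
async function timed<T>(
    label: string,
    step: () => Promise<T>
): Promise<{ result: T; ms: number }> {
    const start = Date.now();
    const result = await step();
    const ms = Date.now() - start;
    console.log(`${label}: ${ms} ms`);
    return { result, ms };
}

// Possible use inside run(), assuming the functions from the script above:
//   await timed('saveBlocksToFile', () => saveBlocksToFile(blocks, './outputs/notion_blocks.json'));
//   await timed('saveBlocksToMarkdown', () => saveBlocksToMarkdown(blocks, './outputs/notion_blocks.md'));
```

Logging each step separately makes it easy to attach concrete numbers to the report instead of rough estimates.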

Expected Behavior

The conversion process should complete in a reasonable amount of time, proportional to the complexity and size of the Notion page being converted.

Actual Behavior

With about 10,000 blocks, the conversion takes more than 10 minutes, far longer than can reasonably be expected.

Possible Solution

I am not sure what might be causing this issue, but it might be related to how data is fetched or processed during the conversion. A review of the fetching and parsing mechanisms might be needed.
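
One way to test the fetching hypothesis is to count how many of the already-fetched blocks report nested children. This is only a sketch under an assumption that this issue does not confirm: that the converter makes at least one additional API request for every block whose `has_children` field is true. The `has_children` field is part of Notion block objects; `BlockLike` and `countBlocksWithChildren` are hypothetical names introduced here.

```typescript
// Minimal shape assumed for this sketch; real Notion block objects carry many more fields.
interface BlockLike {
    has_children?: boolean;
}

// Counts the blocks that report nested children. Under the assumption stated
// above, this is a rough lower bound on extra requests during conversion.
function countBlocksWithChildren(blocks: unknown[]): number {
    return blocks.filter(
        (block) =>
            typeof block === 'object' &&
            block !== null &&
            (block as BlockLike).has_children === true
    ).length;
}

// Possible use with the script above:
//   console.log(`Blocks with children: ${countBlocksWithChildren(blocks)}`);
```

If this count is large and the conversion time grows with it, that would point at per-block fetching rather than at Markdown string processing.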

Additional Context

  • Node.js version: 20.12.0
  • notion-to-md version: 3.1.1

@souvikinator (Owner)

@tictaqqn apologies for the very late response. I'm working on version 4, which also focuses on fixing this issue, and I would like to have your feedback on it.
Here is the discussion link: #112
