Some characters are sometimes incorrectly converted from Markdown to Org #451

Open
1 task done
benthamite opened this issue Nov 1, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@benthamite
Contributor

benthamite commented Nov 1, 2024

Please update gptel first -- errors are often fixed by the time they're reported.

  • I have updated gptel to the latest commit and tested that the issue still exists

Bug Description

This is another instance of the behavior reported in #81. @karthink asked that new cases be reported in a new issue, hence this one.

In this case, the issue arises when converting Claude responses that contain JavaScript code blocks:


:PROPERTIES:
:GPTEL_MODEL: claude-3-5-sonnet-20241022
:GPTEL_BACKEND: Claude
:GPTEL_SYSTEM: You are a large language model living in Emacs and a helpful assistant. Respond concisely.
:GPTEL_BOUNDS: ((1398 . 4372) (4373 . 4428) (4429 . 4563) (4564 . 4600) (4601 . 4633) (4634 . 4722) (4723 . 4746) (4748 . 4813) (4814 . 4835) (4836 . 4862) (4863 . 4910) (4911 . 4957) (4958 . 5809) (5957 . 7290) (7883 . 12429) (13398 . 14398) (14922 . 16218) (16796 . 17682) (18858 . 20140) (20626 . 24145) (24283 . 26588) (26842 . 30118) (38317 . 40674) (47408 . 51035) (57399 . 60437) (60785 . 60932) (61097 . 62584) (66370 . 69409) (80001 . 83666) (84242 . 86193) (86750 . 88789) (89581 . 91572) (91853 . 92668) (92669 . 92690) (92691 . 93125) (93126 . 93187) (93188 . 93803) (111393 . 115189) (115360 . 117420))
:END:

<prior discussion omitted>

I see the issue - there are two problems:

1. We're getting token ID "0" which shouldn't happen. We need to fix the token ID lookup logic.
2. The market update is being blocked by a lock that isn't being released properly.

Let's fix both issues:

1. First, modify the lock handling in =locks.js= to add a timeout for stale locks:

#+begin_src javascript
export async function acquireLock(
    lockName,
    debouncePeriod = 30 * 1000,  // 30 seconds
    staleLockTimeout = 15 * 60 * 1000  // 15 minutes
) {
    let result;
    const trx = await getTransaction();
    const lockRow = getFirstRow(
        await trx.query({
            text: "select is_locked, last_updated from process_lock where id = $1",
            values: [lockName],
        }),
    );
    if (lockRow?.lock_state !== undefined) {
        const timeElapsed = new Date().getTime() - new Date(lockRow.last_updated).getTime();

        if (timeElapsed < debouncePeriod) {
            return =${lockName} debounced=;
        }

        if (timeElapsed > staleLockTimeout && lockRow?.is_locked === true) {
            await trx.query({
                text: =update process_lock set is_locked = $1 where id = $2=,
                values: [false, lockName],
            });
            lockRow.is_locked = false;
            console.log(=Stale lock found: ignored and updated.=);
        }
    }

    if (lockRow?.is_locked === true) {
        return =${lockName} cannot be acquired: operation already in progress.=;
    }

    await trx.query({
        text: =
                insert into process_lock (id, is_locked) values ($1, $2) 
                on conflict (id) do update set is_locked = $2=,
        values: [lockName, true],
    });
    await trx.commit();
    result = true;
    return result;
}
#+end_src

2. Then modify the token lookup in =sync-service.js=:

#+begin_src javascript
async function syncLogs() {
    // ... existing code ...
    
    for (const log of logs) {
        await writeLog(log);

        const isBuy = log.decodedLog.makerAssetId === "0";
        const tokenId = isBuy 
            ? log.decodedLog.takerAssetId 
            : log.decodedLog.makerAssetId;
            
        if (tokenId === "0") {
            console.log("Invalid token ID (0) detected, skipping...");
            continue;
        }

        const tradeAmount = isBuy
            ? Number(log.decodedLog.makerAmountFilled) / 1000000
            : Number(log.decodedLog.takerAmountFilled) / 1000000;

        if (shouldAlertTrade(tradeAmount)) {
            let token = await getToken(tokenId);

            // If market is unknown, try updating markets
            if (!token || token.question === "Unknown market") {
                console.log("Unknown market detected, updating market list...");
                try {
                    await updatePolymarketMarkets(true);
                    // Try getting token info again
                    token = await getToken(tokenId);
                } catch (error) {
                    console.log("Could not update markets:", error.message);
                }
            }

            const alertMessage = formatTradeAlert({
                username: user,
                operation: isBuy ? "Buy" : "Sell",
                amount: tradeAmount,
                price: token?.price || 0,
                question: token?.question || "Unknown market",
                outcome: token?.outcome || "Unknown outcome",
                blockNumber: log.blockNumber
            });

            await sendTelegramAlert(alertMessage);
        }
    }
    // ... rest of the code ...
}
#+end_src


*** Error:

#+begin_src shell
node sync-service.js
file:///Users/pablostafforini/Library/CloudStorage/Dropbox/repos/polymarket-bot/database/locks.js:20
            return =${lockName} debounced=;
                   ^

SyntaxError: Unexpected token '='
    at ModuleLoader.moduleStrategy (node:internal/modules/esm/translators:169:18)
    at callTranslator (node:internal/modules/esm/loader:272:14)
    at ModuleLoader.moduleProvider (node:internal/modules/esm/loader:278:30)

Node.js v20.14.0

#+end_src


Sorry about the template literal syntax error. Here's the corrected version of =locks.js=:

#+begin_src javascript
import { Db, getFirstRow, getTransaction } from "./pg-connection.js";

export async function acquireLock(
    lockName,
    debouncePeriod = 30 * 1000,  // 30 seconds
    staleLockTimeout = 15 * 60 * 1000  // 15 minutes
) {
    let result;
    const trx = await getTransaction();
    const lockRow = getFirstRow(
        await trx.query({
            text: "select is_locked, last_updated from process_lock where id = $1",
            values: [lockName],
        }),
    );
    if (lockRow?.lock_state !== undefined) {
        const timeElapsed = new Date().getTime() - new Date(lockRow.last_updated).getTime();

        if (timeElapsed < debouncePeriod) {
            return =${lockName} debounced=;
        }

        if (timeElapsed > staleLockTimeout && lockRow?.is_locked === true) {
            await trx.query({
                text: =update process_lock set is_locked = $1 where id = $2=,
                values: [false, lockName],
            });
            lockRow.is_locked = false;
            console.log("Stale lock found: ignored and updated.");
        }
    }

    if (lockRow?.is_locked === true) {
        return =${lockName} cannot be acquired: operation already in progress.=;
    }

    await trx.query({
        text: =
            insert into process_lock (id, is_locked) values ($1, $2) 
            on conflict (id) do update set is_locked = $2=,
        values: [lockName, true],
    });
    await trx.commit();
    result = true;
    return result;
}

export async function releaseLock(lockName) {
    const db = new Db();
    return db.query({
        text: =update process_lock set is_locked = $1 where id = $2=,
        values: [false, lockName],
    });
}
#+end_src

I replaced the incorrect syntax (=...=) with proper template literals (=...=). Try running the script again with this corrected version.

*** It seems my editor is converting your backticks into equal signs. Hence the issue. Can you use a different symbol for the backticks, so I can then search for those symbols and replace them with backticks?

Yes, let's use '|' (pipe) to represent backticks. Here's the code - you'll need to replace all '|' with backticks:

#+begin_src javascript
import { Db, getFirstRow, getTransaction } from "./pg-connection.js";

export async function acquireLock(
    lockName,
    debouncePeriod = 30 * 1000,  // 30 seconds
    staleLockTimeout = 15 * 60 * 1000  // 15 minutes
) {
    let result;
    const trx = await getTransaction();
    const lockRow = getFirstRow(
        await trx.query({
            text: "select is_locked, last_updated from process_lock where id = $1",
            values: [lockName],
        }),
    );
    if (lockRow?.lock_state !== undefined) {
        const timeElapsed = new Date().getTime() - new Date(lockRow.last_updated).getTime();

        if (timeElapsed < debouncePeriod) {
            return `${lockName} debounced`;
        }

        if (timeElapsed > staleLockTimeout && lockRow?.is_locked === true) {
            await trx.query({
                text: "update process_lock set is_locked = $1 where id = $2",
                values: [false, lockName],
            });
            lockRow.is_locked = false;
            console.log("Stale lock found: ignored and updated.");
        }
    }

    if (lockRow?.is_locked === true) {
        return `${lockName} cannot be acquired: operation already in progress`;
    }

    await trx.query({
        text: "insert into process_lock (id, is_locked) values ($1, $2) on conflict (id) do update set is_locked = $2",
        values: [lockName, true],
    });
    await trx.commit();
    result = true;
    return result;
}

export async function releaseLock(lockName) {
    const db = new Db();
    return db.query({
        text: "update process_lock set is_locked = $1 where id = $2",
        values: [false, lockName],
    });
}
#+end_src

Replace all | characters with backticks (=) in your editor. The backtick key is usually located to the left of the 1 key on your keyboard.

Backend

Anthropic

Steps to Reproduce

See quoted org excerpt above.

Additional Context

GNU Emacs 30.0.91 (build 1, aarch64-apple-darwin24.0.0, NS appkit-2566.00 Version 15.0.1 (Build 24A348)) of 2024-10
macOS Sequoia

Backtrace

No response

Log Information

No response

benthamite added the bug label on Nov 1, 2024
@karthink
Owner

karthink commented Nov 1, 2024

Thanks. Besides the conversion of backquote ` characters to = inside source blocks, like in the following:

return =${lockName} debounced=;

return =${lockName} cannot be acquired: operation already in progress.=;

is there anything else wrong?

To find the parser bug, it will help to have the response log to compare the Org output with. Could you set gptel-log-level to 'info, then produce the error and provide both the Org output and the JSON stream? You can use a smaller example than the above.

EDIT: Never mind the log, I was able to reproduce the bug with a simple prompt.
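
For readers unfamiliar with this failure mode, here is a minimal JavaScript sketch of how a Markdown-to-Org converter can corrupt template literals. It is not gptel's code (gptel is written in Emacs Lisp and converts a stream), and the function names naiveInlineCode and markdownToOrg are hypothetical. The first function applies the inline-code rule everywhere; the second tracks fence state so that lines inside a code block pass through unchanged.

```javascript
// Hypothetical illustration only: none of these names come from gptel.

// Naive rule: rewrite every backtick-delimited span as Org verbatim (=...=).
function naiveInlineCode(line) {
    return line.replace(/`([^`\n]+)`/g, "=$1=");
}

// Fence-aware conversion: remember whether we are inside a fenced block
// (a line starting with three backticks) and leave its lines untouched.
function markdownToOrg(markdown) {
    let inFence = false;
    const out = [];
    for (const line of markdown.split("\n")) {
        const fence = line.match(/^\s*`{3}\s*(\S*)\s*$/);
        if (fence) {
            // Opening fence becomes #+begin_src <lang>, closing fence #+end_src.
            out.push(inFence ? "#+end_src" : `#+begin_src ${fence[1]}`.trimEnd());
            inFence = !inFence;
        } else {
            // Only prose lines get the inline-code rewrite.
            out.push(inFence ? line : naiveInlineCode(line));
        }
    }
    return out.join("\n");
}
```

With the fence-aware version, a template literal inside a code block survives intact, whereas the naive rule produces the =${lockName} debounced= form seen in the report. Whether the actual bug comes from lost fence state is not established in this thread.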

@karthink
Owner

karthink commented Nov 1, 2024

Second edit: I need that log after all -- the bug I produced is different from the one you're experiencing.

@gzmask

gzmask commented Dec 19, 2024

I am experiencing the same bug, oddly only with Claude; GPT-4 seems to be fine:

Steps to reproduce:

  1. Start conversation with Claude

  2. Tell Claude to use text mode output

    • Ask Claude to type backtick (`)
    • EXPECTED: Shows backtick (`)
    • ACTUAL: Shows backtick (`) correctly
  3. Tell Claude to switch to org-mode output

    • Ask Claude to type backtick (`)
    • EXPECTED: Should show backtick (`)
    • ACTUAL: Shows equals sign (=)

@karthink
Owner

karthink commented Dec 19, 2024 via email

@gzmask

gzmask commented Dec 19, 2024

In Org mode buffers, gptel assumes the LLM output is in Markdown and converts it to Org mode internally. If the LLM output is already in Org markup, things will expectedly go wrong. Possible solutions are:

  1. Ask the LLM to output in Markdown always. This is what I do.
  2. Expose a switch in the transient menu (and via elisp configuration) so the user can turn off automatic conversion to Org as required.
  3. Automatically detect if the output is already in Org and turn off the conversion.

Doing 3 for streaming responses is very hard (without holding up the stream) and out of scope for gptel. It can be implemented for non-streaming responses. Let's call this solution 2.5. What do you think?

Thanks. I tried asking Claude to output Markdown, but it doesn't seem to fix the problem. If there's a way to turn off the automatic conversion, that would be nice.

I tried changing the function gptel--convert-org and re-evaluating it to skip the conversion, but it still had no effect. I wonder where the actual conversion is happening.
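
To make option 3 from the list above more concrete for the non-streaming case, here is a rough JavaScript sketch of a detection heuristic. It is an assumption-laden illustration, not anything gptel provides: looksLikeOrg and maybeConvert are hypothetical names, and the rule (Org block markers present, no Markdown fences or headings) is only one possible guess at what "already in Org" could mean.

```javascript
// Hypothetical heuristic, not part of gptel: guess whether a complete
// (non-streaming) response is already written in Org markup.
function looksLikeOrg(response) {
    const lines = response.split("\n");
    // Org block markers such as #+begin_src, #+begin_example, #+begin_quote.
    const orgBlock = lines.some((l) => /^\s*#\+begin_(src|example|quote)\b/i.test(l));
    // Markdown-only constructs: fenced code blocks and ATX headings.
    const mdFence = lines.some((l) => /^\s*`{3}/.test(l));
    const mdHeading = lines.some((l) => /^#{1,6}\s+\S/.test(l));
    // Treat the response as Org only when Org block markers are present
    // and no Markdown-only construct was seen.
    return orgBlock && !mdFence && !mdHeading;
}

// Skip the conversion step when the response already looks like Org.
// `convert` stands in for whatever Markdown-to-Org function is in use.
function maybeConvert(response, convert) {
    return looksLikeOrg(response) ? response : convert(response);
}
```

A response with no block markers at all (for example, a single line containing one backtick) is still treated as Markdown here, so this heuristic alone would not cover the reproduction steps given earlier in the thread.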
