OpenAI structured outputs support #1307

Open
antoniomdk opened this issue Oct 1, 2024 · 11 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed question Further information is requested


@antoniomdk
Contributor

Feature Request

I've been working with typia.llm.schema for a while and it has been extremely helpful in generating JSON schemas to call LLMs from TS types. However, the new structured outputs API of OpenAI has some limitations in the type of schemas it can take.

In particular, nullable is not taken into account. It would be great if we could map types X | null to anyOf, perhaps by introducing a new flag to the typia.llm.schema function.

Also, for types that don't extend from Record, we should set [additionalProperties to false](https://platform.openai.com/docs/guides/structured-outputs/additionalproperties-false-must-always-be-set-in-objects).
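
To make the two requests concrete, here is a minimal sketch of the kind of rewrite being asked for. The `LooseSchema` type and the `toStructuredOutputSchema` helper are hypothetical names for illustration only, not part of typia, and the sketch assumes the input marks nullability with the OpenAPI v3.0 style `nullable: true`.

```typescript
// Minimal JSON schema shape for this sketch only (not typia's ILlmSchema).
interface LooseSchema {
  type?: string;
  format?: string;
  nullable?: boolean;
  properties?: Record<string, LooseSchema>;
  required?: string[];
  items?: LooseSchema;
  anyOf?: LooseSchema[];
  oneOf?: LooseSchema[];
  additionalProperties?: boolean | LooseSchema;
}

// Hypothetical helper: rewrites a schema into the shape discussed above.
function toStructuredOutputSchema(schema: LooseSchema): LooseSchema {
  // `nullable: true` (OpenAPI v3.0 style) becomes `anyOf: [T, { type: "null" }]`.
  if (schema.nullable === true) {
    const inner: LooseSchema = { ...schema };
    delete inner.nullable;
    return { anyOf: [toStructuredOutputSchema(inner), { type: "null" }] };
  }
  const next: LooseSchema = { ...schema };
  // Unions written as `oneOf` are re-expressed as `anyOf`.
  const union = schema.oneOf ?? schema.anyOf;
  if (union !== undefined) {
    delete next.oneOf;
    next.anyOf = union.map((member) => toStructuredOutputSchema(member));
  }
  if (schema.items !== undefined) {
    next.items = toStructuredOutputSchema(schema.items);
  }
  if (schema.type === "object") {
    const source = schema.properties ?? {};
    const properties: Record<string, LooseSchema> = {};
    for (const key of Object.keys(source)) {
      const child = source[key];
      if (child !== undefined) properties[key] = toStructuredOutputSchema(child);
    }
    next.properties = properties;
    // Close the object only when it declares no dynamic (Record-like) keys.
    if (schema.additionalProperties === undefined) {
      next.additionalProperties = false;
    }
  }
  return next;
}
```

Objects that already declare `additionalProperties` (the Record-like case) are left untouched, which is the distinction the request draws.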

I can contribute to this feature, but I may need some pointers for code references to start.

@samchon samchon added enhancement New feature or request good first issue Good for newcomers labels Oct 1, 2024
@samchon samchon added the help wanted Extra attention is needed label Oct 1, 2024
@samchon
Owner

samchon commented Oct 1, 2024

The T | null type cannot be written as a oneOf type under the JSON schema specification (of OpenAPI v3.0) that OpenAI has adopted. Writing T | null as a oneOf type has been allowed only since the JSON schema 2020-12 draft (of OpenAPI v3.1).

By the way, does OpenAI understand only the anyOf type? Currently, @samchon/openapi and typia use the oneOf type for TypeScript union types, because oneOf has a clearer meaning than anyOf.

@samchon samchon removed the enhancement New feature or request label Oct 1, 2024
@samchon
Owner

samchon commented Oct 1, 2024

Also, we need to be a little careful about setting additionalProperties to false.

additionalProperties := false means that no superfluous properties of any type are allowed. Under this validation rule, any extra property that is not defined in properties must be considered invalid.

Therefore, if you want to contribute to the typia.llm.application<App>() and typia.llm.schema<T>() functions, you have to be careful about this rule.
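
As a small illustration of that rule, the sketch below lists the keys that a closed object schema would reject. `ClosedObjectSketch` and `findSuperfluousKeys` are hypothetical names invented for this example; a real validator checks much more than key names.

```typescript
// Hypothetical, trimmed-down object schema; just enough for this sketch.
interface ClosedObjectSketch {
  properties: Record<string, { type: string }>;
  additionalProperties?: boolean;
}

// Lists the keys of `value` that `additionalProperties: false` would reject.
function findSuperfluousKeys(
  schema: ClosedObjectSketch,
  value: Record<string, unknown>,
): string[] {
  if (schema.additionalProperties !== false) return [];
  return Object.keys(value).filter(
    (key) => !Object.prototype.hasOwnProperty.call(schema.properties, key),
  );
}

const articleSketch: ClosedObjectSketch = {
  properties: { title: { type: "string" }, body: { type: "string" } },
  additionalProperties: false,
};

// "tags" is not declared in `properties`, so this value would be invalid.
const rejectedKeys = findSuperfluousKeys(articleSketch, {
  title: "Hello",
  body: "World",
  tags: ["a"],
});
```

This is why closing every object blindly would be wrong for Record-like types: their dynamic keys would all count as superfluous.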

Here is the code that fills the ILlmSchema.IObject.additionalProperties property. You can accomplish what you want just by changing the return type of the join() function from ILlmSchema | undefined to ILlmSchema | false (and returning false in the empty case).

/**
 * @internal
 */
const join = (extra: ISuperfluous): ILlmSchema | undefined => {
  // LIST UP METADATA
  const elements: [Metadata, ILlmSchema][] = Object.values(
    extra.patternProperties || {},
  );
  if (extra.additionalProperties) elements.push(extra.additionalProperties);
  // SHORT RETURN
  if (elements.length === 0) return undefined;
  else if (elements.length === 1) return elements[0]![1]!;
  // MERGE METADATA AND GENERATE VULNERABLE SCHEMA
  const meta: Metadata = elements
    .map((tuple) => tuple[0])
    .reduce((x, y) => Metadata.merge(x, y));
  return llm_schema_station({
    blockNever: true,
    attribute: {},
    metadata: meta,
  });
};
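
A simplified, self-contained sketch of the suggested change might look like the following. The types are stand-ins (typia's real Metadata and ILlmSchema are richer), and the merge step is replaced by a plain anyOf purely so the example can run on its own; the actual function merges metadata and regenerates the schema as shown above.

```typescript
// Stand-in types for this sketch only.
interface ExtraSchemaSketch {
  type?: string;
  anyOf?: ExtraSchemaSketch[];
}
interface SuperfluousSketch {
  patternProperties?: Record<string, ExtraSchemaSketch>;
  additionalProperties?: ExtraSchemaSketch;
}

// Same control flow as join(), but the empty case yields `false`.
function joinOrFalse(extra: SuperfluousSketch): ExtraSchemaSketch | false {
  // List up every schema that describes dynamic keys.
  const patterns = extra.patternProperties ?? {};
  const elements: ExtraSchemaSketch[] = [];
  for (const key of Object.keys(patterns)) {
    const schema = patterns[key];
    if (schema !== undefined) elements.push(schema);
  }
  if (extra.additionalProperties !== undefined) {
    elements.push(extra.additionalProperties);
  }
  // No dynamic keys at all: the object is closed.
  if (elements.length === 0) return false;
  const first = elements[0];
  if (elements.length === 1 && first !== undefined) return first;
  // Simplification: the real code merges metadata instead of listing a union.
  return { anyOf: elements };
}
```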

@samchon samchon added the question Further information is requested label Oct 1, 2024
@antoniomdk
Contributor Author

I haven't found any information on whether OpenAI supports oneOf; they do mention that they support anyOf. But I agree that oneOf should be the right type (it doesn't make any sense for a type to be null and not null at the same time). That's why I suggested putting this behavior change under a flag, or making the user explicitly ask for it, because it deviates from the OpenAPI and JSON schema standards.

@antoniomdk
Contributor Author

antoniomdk commented Oct 1, 2024

@samchon
Owner

samchon commented Oct 1, 2024

How about the other models?

Google Gemini uses the JSON schema specified by OpenAPI v3.0.3, but it does not support oneOf.

OpenAI sometimes looks like it is using OpenAPI v3.1, and sometimes v3.0. It supports mixed-in types expressed as type: ["string", "null"], but it does not support the tuple type expressed as { type: "array", prefixItems: [A, B, C] }. I need to study and test OpenAI in depth next weekend.
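
For reference, the competing ways of writing string | null mentioned in this thread can be laid side by side. The sketch below uses local types only (nothing from @samchon/openapi), and `allowsNull` is a hypothetical helper that recognizes each form.

```typescript
// Local types for this sketch; they are not taken from @samchon/openapi.
type TypeOnly = { type: string };
type NullableEncoding =
  | { type: string; nullable: true } // OpenAPI v3.0 style
  | { type: string[] } // mixed-in type array (JSON schema 2020-12 / OpenAPI v3.1)
  | { anyOf: TypeOnly[] } // union listed with anyOf
  | { oneOf: TypeOnly[] }; // union listed with oneOf

// Reports whether a schema written in any of the four forms accepts null.
function allowsNull(schema: NullableEncoding): boolean {
  if ("anyOf" in schema) {
    return schema.anyOf.some((member) => member.type === "null");
  }
  if ("oneOf" in schema) {
    return schema.oneOf.some((member) => member.type === "null");
  }
  if ("nullable" in schema) return schema.nullable === true;
  return schema.type.indexOf("null") !== -1;
}

// The same `string | null` type in each encoding.
const encodings: NullableEncoding[] = [
  { type: "string", nullable: true },
  { type: ["string", "null"] },
  { anyOf: [{ type: "string" }, { type: "null" }] },
  { oneOf: [{ type: "string" }, { type: "null" }] },
];
```

Which of these a given provider accepts is exactly what still needs testing, so the list is descriptive rather than a compatibility table.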

@samchon
Owner

samchon commented Oct 1, 2024

To support the LLM function calling feature precisely, I should separate the providers, in one of the ways below.

  • Top level namespaces
    • typia.openai.application<App>(): ILlmApplication<IOpenAiSchema>
    • typia.gemini.application<App>(): ILlmApplication<IGeminiSchema>
    • typia.llama.application<App>(): ILlmApplication<ILlamaSchema>
  • Nested namespaces
    • typia.llm.openai.application<App>()
    • typia.llm.gemini.application<App>()
    • typia.llm.llama.application<App>()
  • Generic Argument
    • typia.llm.application<App, "openai">()
    • typia.llm.application<App, "gemini">()
    • typia.llm.application<App, "llama">()
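
The "Generic Argument" option can be sketched at the type level as a lookup from model name to schema type. Everything here is a placeholder: the three schema shapes are invented for the example (the real IOpenAiSchema, IGeminiSchema and ILlamaSchema are not defined in this thread), and `nullableStringFor` is a hypothetical function, not a typia API.

```typescript
// Placeholder schema shapes, assumed only for this sketch.
interface OpenAiSchemaSketch {
  type?: string;
  anyOf?: OpenAiSchemaSketch[];
}
interface GeminiSchemaSketch {
  type: string;
  nullable?: boolean;
}
interface LlamaSchemaSketch {
  type?: string;
  oneOf?: LlamaSchemaSketch[];
}

// One lookup table drives the generic argument.
interface SchemaByModel {
  openai: OpenAiSchemaSketch;
  gemini: GeminiSchemaSketch;
  llama: LlamaSchemaSketch;
}
type ModelName = keyof SchemaByModel;

// Each provider gets its own way of writing `string | null`.
const nullableStringBuilders: { [M in ModelName]: () => SchemaByModel[M] } = {
  openai: () => ({ anyOf: [{ type: "string" }, { type: "null" }] }),
  gemini: () => ({ type: "string", nullable: true }),
  llama: () => ({ oneOf: [{ type: "string" }, { type: "null" }] }),
};

// The return type follows the model name passed by the caller.
function nullableStringFor<M extends ModelName>(model: M): SchemaByModel[M] {
  return nullableStringBuilders[model]();
}

// Typed as GeminiSchemaSketch without any cast.
const geminiNullable = nullableStringFor("gemini");
```

The appeal of this variant is that one entry point stays in place while the compiler still tracks which schema dialect the caller gets back.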

@samchon
Owner

samchon commented Oct 2, 2024

@antoniomdk If you send a PR for additionalProperties, I'll accept it.

Also, for handling each specific LLM provider's schema, I'll prepare a major update.

It will be @samchon/openapi@2.0.0 and typia@7.0.0.

@antoniomdk
Contributor Author

@samchon That sounds great! I think the LLM-specific separation makes a lot of sense. I'll send a PR for additionalProperties by EOW (probably during the weekend).

@bradleat

Related to LLM structured outputs, I find that when prompting I often want to use the jsdoc comment for a type in the prompt. Can typia add a misc method for returning the jsdoc string of a particular type?

Using typia.reflect.metadata can get you this information, but it'd be nice to just get the jsdoc comment.
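
One workaround, sketched here under assumptions, is to read the description fields that the generated schemas already carry (the compiled output later in this thread shows jsdoc text landing in description). `DescribedSchema` and `describeForPrompt` are hypothetical names; this is not an existing typia function.

```typescript
// Local shape: only the fields this sketch reads from a generated schema.
interface DescribedSchema {
  description?: string;
  properties?: Record<string, DescribedSchema>;
}

// Builds a short prompt fragment from a type's jsdoc-derived descriptions.
function describeForPrompt(typeName: string, schema: DescribedSchema): string {
  const fallback = "(no description)";
  const lines: string[] = [`${typeName}: ${schema.description ?? fallback}`];
  const properties = schema.properties ?? {};
  for (const key of Object.keys(properties)) {
    const child = properties[key];
    if (child !== undefined) {
      lines.push(`- ${key}: ${child.description ?? fallback}`);
    }
  }
  return lines.join("\n");
}

const promptFragment = describeForPrompt("ICreate", {
  description: "Information of the article to create.",
  properties: {
    title: { description: "Title of the article." },
    body: {},
  },
});
```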

@samchon
Owner

samchon commented Nov 15, 2024

@antoniomdk, @bradleat https://github.com/samchon/openapi/blob/v2.0/src/structures/IChatGptSchema.ts

I'm preparing an OpenAI-dedicated schema type, IChatGptSchema, for the next version of @samchon/openapi and typia.

Here is the type. I'll test it with the ChatGPT API, considering the things below.

  • Whether to adopt the $ref type for every named schema, or only for recursive types
  • Whether to just use the oneOf type and its discriminator property for clear union type discrimination
  • Whether to use the const type or the enum property
    • OpenAI's documentation says it supports the JSON schema v7 specification (OpenApi.IJsonSchema)
    • However, OpenAI's examples only use anyOf
    • Also, const is clearer than enum, but the examples just use enum
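
On the const versus enum point, a single-value enum carries the same constraint as const, so a conversion is mechanical. The sketch below shows that normalization with local types; `toEnumForm` is a hypothetical helper, and whether IChatGptSchema should emit enum at all is the open question above.

```typescript
// Local types for this sketch only.
type Literal = string | number | boolean;
type LiteralSchemaSketch = { const: Literal } | { enum: Literal[] };

// Rewrites `const: x` as the equivalent single-value `enum: [x]`.
function toEnumForm(schema: LiteralSchemaSketch): { enum: Literal[] } {
  return "const" in schema
    ? { enum: [schema.const] }
    : { enum: schema.enum.slice() };
}

const normalizedConst = toEnumForm({ const: "chatgpt" });
```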

If you want to try it earlier, install the typia@next version and call typia.llm.application<App, "chatgpt">().

npm install typia@next

@samchon samchon added the enhancement New feature or request label Nov 15, 2024
@samchon
Owner

samchon commented Nov 15, 2024

Here is an example use case of the IChatGptSchema currently under consideration.

Source Code

import {
  ChatGptTypeChecker,
  IChatGptSchema,
  ILlmApplication,
} from "@samchon/openapi";
import typia, { tags } from "typia";

const app: ILlmApplication<"chatgpt"> = typia.llm.application<
  BbsArticleController,
  "chatgpt"
>({
  separate: (schema: IChatGptSchema) =>
    ChatGptTypeChecker.isString(schema) &&
    schema.contentMediaType !== undefined,
});
console.log(app);

interface BbsArticleController {
  /**
   * Create a new article.
   *
   * Writes a new article and archives it into the DB.
   *
   * @param input Information of the article to create
   * @returns Newly created article
   */
  create(input: IBbsArticle.ICreate): Promise<IBbsArticle>;
 
  /**
   * Update an article.
   *
   * Updates an article with new content.
   *
   * @param id Target article's {@link IBbsArticle.id}
   * @param input New content to update
   */
  update(
    id: string & tags.Format<"uuid">,
    input: IBbsArticle.IUpdate,
  ): Promise<void>;
 
  /**
   * Erase an article.
   *
   * Erases an article from the DB.
   *
   * @param id Target article's {@link IBbsArticle.id}
   */
  erase(id: string & tags.Format<"uuid">): Promise<void>;
}
 
/**
 * Article entity.
 *
 * `IBbsArticle` is an entity representing an article in the BBS (Bulletin Board System).
 */
interface IBbsArticle extends IBbsArticle.ICreate {
  /**
   * Primary Key.
   */
  id: string & tags.Format<"uuid">;
 
  /**
   * Creation time of the article.
   */
  created_at: string & tags.Format<"date-time">;
 
  /**
   * Last updated time of the article.
   */
  updated_at: string & tags.Format<"date-time">;
}
namespace IBbsArticle {
  /**
   * Information of the article to create.
   */
  export interface ICreate {
    /**
     * Title of the article.
     *
     * Representative title of the article.
     */
    title: string;
 
    /**
     * Content body.
     *
     * Content body of the article written in the markdown format.
     */
    body: string;
 
    /**
     * Thumbnail image URI.
     *
     * Thumbnail image URI which can represent the article.
     *
     * If configured as `null`, it means that no thumbnail image in the article.
     */
    thumbnail:
      | null
      | (string & tags.Format<"uri"> & tags.ContentMediaType<"image/*">);
  }
 
  /**
   * Information of the article to update.
   *
   * Only the filled properties will be updated.
   */
  export type IUpdate = Partial<ICreate>;
}
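
The separate predicate above flags string schemas that carry a contentMediaType. As far as can be read from this example, the intent is to mark values (such as an uploaded image) that the LLM should not compose itself; that reading is an assumption. The sketch below applies the same predicate to a flat property map, using local types and a hypothetical `splitByComposer` helper rather than the real ChatGptTypeChecker.

```typescript
// Local stand-in for a property schema; not IChatGptSchema itself.
interface FieldSketch {
  type: string;
  format?: string;
  contentMediaType?: string;
}

// Same condition as the `separate` callback in the source code above.
const needsHuman = (schema: FieldSketch): boolean =>
  schema.type === "string" && schema.contentMediaType !== undefined;

// Partitions property names by who is expected to fill them in.
function splitByComposer(properties: Record<string, FieldSketch>): {
  llm: string[];
  human: string[];
} {
  const llm: string[] = [];
  const human: string[] = [];
  for (const key of Object.keys(properties)) {
    const schema = properties[key];
    if (schema === undefined) continue;
    (needsHuman(schema) ? human : llm).push(key);
  }
  return { llm, human };
}

// Thumbnail is simplified to a non-nullable string for this sketch.
const composerSplit = splitByComposer({
  title: { type: "string" },
  body: { type: "string" },
  thumbnail: { type: "string", format: "uri", contentMediaType: "image/*" },
});
```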

Compiled Code

import * as __typia_transform__llmApplicationFinalize from "typia/lib/internal/_llmApplicationFinalize.js";
import { ChatGptTypeChecker } from "@samchon/openapi";
import typia from "typia";
const app = (() => {
  const app = {
    model: "chatgpt",
    functions: [
      {
        name: "create",
        parameters: [
          {
            $ref: "#/$defs/IBbsArticle.ICreate",
            description: "Information of the article to create",
            $defs: {
              "IBbsArticle.ICreate": {
                type: "object",
                properties: {
                  title: {
                    type: "string",
                    title: "Title of the article",
                    description:
                      "Title of the article.\n\nRepresentative title of the article.",
                  },
                  body: {
                    type: "string",
                    title: "Content body",
                    description:
                      "Content body.\n\nContent body of the article written in the markdown format.",
                  },
                  thumbnail: {
                    oneOf: [
                      {
                        type: "null",
                      },
                      {
                        type: "string",
                        format: "uri",
                        contentMediaType: "image/*",
                      },
                    ],
                    title: "Thumbnail image URI",
                    description:
                      "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                  },
                },
                required: ["title", "body", "thumbnail"],
                description: "Information of the article to create.",
                additionalProperties: false,
              },
            },
          },
        ],
        output: {
          $ref: "#/$defs/IBbsArticle",
          description: "Newly created article",
          $defs: {
            IBbsArticle: {
              type: "object",
              properties: {
                id: {
                  type: "string",
                  format: "uuid",
                  title: "Primary Key",
                  description: "Primary Key.",
                },
                created_at: {
                  type: "string",
                  format: "date-time",
                  title: "Creation time of the article",
                  description: "Creation time of the article.",
                },
                updated_at: {
                  type: "string",
                  format: "date-time",
                  title: "Last updated time of the article",
                  description: "Last updated time of the article.",
                },
                title: {
                  type: "string",
                  title: "Title of the article",
                  description:
                    "Title of the article.\n\nRepresentative title of the article.",
                },
                body: {
                  type: "string",
                  title: "Content body",
                  description:
                    "Content body.\n\nContent body of the article written in the markdown format.",
                },
                thumbnail: {
                  oneOf: [
                    {
                      type: "null",
                    },
                    {
                      type: "string",
                      format: "uri",
                      contentMediaType: "image/*",
                    },
                  ],
                  title: "Thumbnail image URI",
                  description:
                    "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                },
              },
              required: [
                "id",
                "created_at",
                "updated_at",
                "title",
                "body",
                "thumbnail",
              ],
              description:
                "Article entity.\n\n`IBbsArticle` is an entity representing an article in the BBS (Bulletin Board System).",
              additionalProperties: false,
            },
          },
        },
        description:
          "Create a new article.\n\nWrites a new article and archives it into the DB.",
      },
      {
        name: "update",
        parameters: [
          {
            type: "string",
            format: "uuid",
            description: "Target article's ",
          },
          {
            $ref: "#/$defs/PartialIBbsArticle.ICreate",
            description: "New content to update",
            $defs: {
              "PartialIBbsArticle.ICreate": {
                type: "object",
                properties: {
                  title: {
                    type: "string",
                    title: "Title of the article",
                    description:
                      "Title of the article.\n\nRepresentative title of the article.",
                  },
                  body: {
                    type: "string",
                    title: "Content body",
                    description:
                      "Content body.\n\nContent body of the article written in the markdown format.",
                  },
                  thumbnail: {
                    oneOf: [
                      {
                        type: "null",
                      },
                      {
                        type: "string",
                        format: "uri",
                        contentMediaType: "image/*",
                      },
                    ],
                    title: "Thumbnail image URI",
                    description:
                      "Thumbnail image URI.\n\nThumbnail image URI which can represent the article.\n\nIf configured as `null`, it means that no thumbnail image in the article.",
                  },
                },
                description: "Make all properties in T optional",
                additionalProperties: false,
              },
            },
          },
        ],
        description:
          "Update an article.\n\nUpdates an article with new content.",
      },
      {
        name: "erase",
        parameters: [
          {
            type: "string",
            format: "uuid",
            description: "Target article's ",
          },
        ],
        description: "Erase an article.\n\nErases an article from the DB.",
      },
    ],
    options: {
      separate: null,
    },
  };
  __typia_transform__llmApplicationFinalize._llmApplicationFinalize(app, {
    separate: (schema) =>
      ChatGptTypeChecker.isString(schema) &&
      schema.contentMediaType !== undefined,
  });
  return app;
})();
console.log(app);

3 participants