diff --git a/README.md b/README.md
index 864d9df..21c56ed 100644
--- a/README.md
+++ b/README.md
@@ -4,10 +4,11 @@ Image Analyzer for Home Assistant using GPT-4o.
 **ha-gpt4vision** creates the `gpt4vision.image_analyzer` service in Home Assistant.
-This service sends an image to OpenAI using its API and returns the model's output as a response variable, making it easy to use in automations.
+This service sends an image to an AI provider and returns the output as a response variable for easy use in automations.
+Supported providers are OpenAI and [LocalAI](https://github.com/mudler/LocalAI).
 
 ## Features
-- Compatible with both OpenAI's API or [LocalAI](https://github.com/mudler/LocalAI).
+- Compatible with both OpenAI's API and [LocalAI](https://github.com/mudler/LocalAI).
 - Images can be downscaled for faster processing.
 - Can be installed through HACS and can be set up in the Home Assistant UI.
 
@@ -37,72 +38,19 @@ This service sends an image to OpenAI using its API and returns the model's outp
 After restarting, the gpt4vision.image_analyzer service will be available. You can test it in the developer tools section in home assistant.
 To get GPT's analysis of a local image, use the following service call.
-```yaml
-service: gpt4vision.image_analyzer
-data:
-  message: '[Prompt message for AI]'
-  model: '[model]'
-  image_file: '[path for image file]'
-  target_width: [Target width for image downscaling]
-  max_tokens: [maximum number of tokens]'
-```
-The parameters `message`, `max_tokens` and `image_file` are mandatory for the execution of the service.
-Optionally, the `model` and the `target_width` can be set. For available models check this page: https://platform.openai.com/docs/models.
-
-## Automation Example
-In automations, if your response variable name is `response`, you can access the response as `{{response.response_text}}`:
-```yaml
-sequence:
-  - service: gpt4vision.image_analyzer
-    metadata: {}
-    data:
-      message: Describe the person in the image
-      image_file: /config/www/tmp/test.jpg
-      max_tokens: 100
-    response_variable: response
-  - service: tts.speak
-    metadata: {}
-    data:
-      cache: true
-      media_player_entity_id: media_player.entity_id
-      message: "{{response.response_text}}"
-    target:
-      entity_id: tts.tts_entity
-```
-
-## Usage Examples
-### Example 1: Announcement for package delivery
-If your camera doesn't support built-in delivery announcements, this is likely the easiest way to get them without running an object detection model.
-
 ```yaml
 service: gpt4vision.image_analyzer
 data:
   max_tokens: 100
+  message: Describe what you see in this image
+  image_file: /config/www/tmp/example.jpg
+  provider: LocalAI
   model: gpt-4o
   target_width: 1280
-  image_file: '/config/www/tmp/front_porch.jpg'
-  message: >-
-    Does it look like the person is delivering a package? Answer with only "yes"
-    or "no".
-    # Answer: yes
 ```
-man delivering package
+The parameters `message`, `max_tokens`, `image_file` and `provider` are required.
+Optionally, the `model` and `target_width` properties can be set. For available models check these pages: [OpenAI](https://platform.openai.com/docs/models) and [LocalAI](https://localai.io/models/).
-
-### Example 2: Suspicious behaviour
-An automation could be triggered if a person is detected around the house when no one is home.
-![suspicious behaviour](https://github.com/valentinfrlch/ha-gpt4vision/assets/85313672/411678c4-f344-4eeb-9eb2-b78484a4d872)
-
-```
-service: gpt4vision.image_analyzer
-data:
-  max_tokens: 100
-  model: gpt-4o
-  target_width: 1280
-  image_file: '/config/www/tmp/garage.jpg'
-  message: >-
-    What is the person doing? Does anything look suspicious? Answer only with
-    "yes" or "no".
-```
 
 ## Issues
 > [!NOTE]
 > **Bugs:** If you encounter any bugs and have read the docs carefully, feel free to file a bug report.
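
---

Reviewer note: for context, a full service call after this change might look like the sketch below. It is assembled only from the parameters named in the diff (`message`, `max_tokens`, `image_file`, `provider` required; `model`, `target_width` optional); the `OpenAI` provider value, the prompt, and the file path are illustrative assumptions, not taken from the code.

```yaml
service: gpt4vision.image_analyzer
data:
  provider: OpenAI                            # or LocalAI
  message: Describe what you see in this image
  image_file: /config/www/tmp/example.jpg
  max_tokens: 100
  model: gpt-4o                               # optional
  target_width: 1280                          # optional
```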