This repository has been archived by the owner on Oct 14, 2024. It is now read-only.

Merge pull request #311 from janhq/0.6.0-docs
Update docs for the newest version
urmauur authored Aug 29, 2024
2 parents d7b1a21 + 68f13aa commit 1443c26
Showing 35 changed files with 504 additions and 608 deletions.
Binary file added src/pages/docs/_assets/assistant-slider.png
Binary file added src/pages/docs/_assets/model-management1.png
Binary file added src/pages/docs/_assets/model-management2.png
Binary file added src/pages/docs/_assets/preserve.png
Binary file added src/pages/docs/_assets/quick-ask.png
Binary file added src/pages/docs/_assets/spell.png
Binary file added src/pages/docs/_assets/vulkan.png
2 changes: 1 addition & 1 deletion src/pages/docs/_meta.json
@@ -19,10 +19,10 @@
},
"models": "Models",
"tools": "Tools",
"assistants": "Assistants",
"threads": "Threads",
"settings": "Settings",
"shortcuts": "Keyboard Shortcuts",
"local-api": "",
"inference-engines": {
"title": "MODEL PROVIDER",
"type": "separator"
34 changes: 34 additions & 0 deletions src/pages/docs/assistants.mdx
@@ -0,0 +1,34 @@
---
title: Assistants
description: A step-by-step guide on customizing your assistant.
keywords:
[
Jan,
Customizable Intelligence, LLM,
local AI,
privacy focus,
free and open source,
private and offline,
conversational AI,
no-subscription fee,
large language models,
manage assistants,
assistants,
]
---

import { Callout, Steps } from 'nextra/components'

# Assistants
This guide explains how to set the Assistant instructions in the Jan application.

## Apply the Instructions to All Threads
To apply the instructions to all new threads, follow these steps:
1. Select a **Thread**.
2. Click the **Assistant** tab.
3. Toggle the **slider** to ensure these instructions are applied to all new threads. (Activate the **Experimental Mode** feature to enable this option.)
<br/>

![Assistant Slider](./_assets/assistant-slider.png)

<br/>
10 changes: 10 additions & 0 deletions src/pages/docs/built-in/_meta.json
@@ -0,0 +1,10 @@
{
"llama-cpp": {
"title": "llama.cpp",
"href": "/docs/built-in/llama-cpp"
},
"tensorrt-llm": {
"title": "TensorRT-LLM",
"href": "/docs/built-in/tensorrt-llm"
}
}
69 changes: 19 additions & 50 deletions src/pages/docs/built-in/llama-cpp.mdx
@@ -1,6 +1,6 @@
---
title: llama.cpp
description: A step-by-step guide on how to customize the llama.cpp extension.
description: A step-by-step guide on how to customize the llama.cpp engine.
keywords:
[
Jan,
@@ -13,7 +13,7 @@ keywords:
no-subscription fee,
large language models,
Llama CPP integration,
llama.cpp Extension,
llama.cpp Engine,
Intel CPU,
AMD CPU,
NVIDIA GPU,
@@ -30,59 +30,28 @@ import { Callout, Steps } from 'nextra/components'

## Overview

Jan has a default [C++ inference server](https://github.com/janhq/nitro) built on top of [llama.cpp](https://github.com/ggerganov/llama.cpp). This server provides an OpenAI-compatible API, queues, scaling, and additional features on top of the wide capabilities of `llama.cpp`.
Jan has a default [C++ inference server](https://github.com/janhq/cortex) built on top of [llama.cpp](https://github.com/ggerganov/llama.cpp). This server provides an OpenAI-compatible API, queues, scaling, and additional features on top of the wide capabilities of `llama.cpp`.
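
As a rough illustration of that OpenAI-compatible surface, a minimal sketch is shown below. It assumes Jan's Local API Server is enabled and reachable at `http://localhost:1337`; both the address and the model id are assumptions, so substitute whatever your server settings and downloaded models actually show.

```bash
# Minimal sketch: chat completion against the local OpenAI-compatible server.
# The port and model id below are assumptions; adjust to your own setup.
curl http://localhost:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Hello from llama.cpp"}]
      }'
```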

## llama.cpp Extension
## llama.cpp Engine

This guide shows you how to modify your engine's behavior by adjusting its settings in the `model.json` file.
This guide shows you how to initialize the `llama.cpp` engine, which downloads and installs the required dependencies so you can start chatting with a model.

## Prerequisites
<Tabs items={['Mac', 'Windows', 'Linux']}>
<Tabs.Tab >
<Tabs items={['Mac Intel', 'Mac Silicon']}>
<Tabs.Tab >
- Make sure you're using an Intel-based Mac. For a complete list of supported Intel CPUs, please see [here](https://en.wikipedia.org/wiki/MacBook_Pro_(Intel-based)).
- For Mac Intel, it is recommended to utilize smaller models.
<Callout type="info">
This uses CPU by default, and no acceleration option is available.
</Callout>

</Tabs.Tab>

<Tabs.Tab >
- Make sure you're using a Mac Silicon. For a complete list of supported Apple Silicon CPUs, please see [here](https://en.wikipedia.org/wiki/Apple_Silicon).
- Using an adequate model size based on your hardware is recommended for Mac Silicon.
<Callout type="info">
This can use Apple GPU with Metal by default for acceleration. Apple ANE is not supported yet.
</Callout>
</Tabs.Tab>

</Tabs>
</Tabs.Tab>
<Tabs.Tab >
- Ensure that you have **Windows with x86_64** architecture.
- Select a model size suited to your Windows hardware by looking for the `Recommended RAM` tag in the Hub.

<Callout type="info">
- It will use CPU by default if you do not have any GPU/ NPU.
- Ensure that you have also installed the correct CPU instruction.
</Callout>


</Tabs.Tab>
<Tabs.Tab >
- Ensure that you have **Linux with x86_64** architecture.
- Select a model size suited to your Linux hardware based on the `Recommended` tag for `RAM` in the Hub.

<Callout type="info">
- It will use CPU by default if you do not have any GPU/ NPU.
- Ensure that you have also installed the correct CPU instruction.
</Callout>

</Tabs.Tab>
</Tabs>
- Mac Intel:
- Make sure you're using an Intel-based Mac. For a complete list of supported Intel CPUs, please see [here](https://en.wikipedia.org/wiki/MacBook_Pro_(Intel-based)).
- For Mac Intel, it is recommended to utilize smaller models.
- Mac Silicon:
- Make sure you're using a Mac with Apple Silicon. For a complete list of supported Apple Silicon CPUs, please see [here](https://en.wikipedia.org/wiki/Apple_Silicon).
- Using an adequate model size based on your hardware is recommended for Mac Silicon.
<Callout type="info">
This can use Apple GPU with Metal by default for acceleration. Apple ANE is not supported yet.
</Callout>
- Windows:
- Ensure that you have **Windows with x86_64** architecture.
- Linux:
- Ensure that you have **Linux with x86_64** architecture.
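
A quick way to confirm the architecture prerequisites above on Mac or Linux is sketched below; it uses the standard `uname` utility, and Windows users can check **System Information** instead.

```bash
# Verify the CPU architecture before downloading models (Mac/Linux).
uname -m
# x86_64 -> Intel/AMD 64-bit (Intel Macs, most Linux machines)
# arm64  -> Apple Silicon Macs
```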

### GPU Acceleration Options
#### GPU Acceleration Options
Enable the GPU acceleration option within the Jan application by following the [Installation Setup](/docs/desktop-installation) guide.
## Step-by-step Guide
<Steps>
42 changes: 10 additions & 32 deletions src/pages/docs/built-in/tensorrt-llm.mdx
@@ -1,6 +1,6 @@
---
title: TensorRT-LLM
description: A step-by-step guide on customizing the TensorRT-LLM extension.
description: A step-by-step guide on customizing the TensorRT-LLM engine.
keywords:
[
Jan,
@@ -12,10 +12,10 @@ keywords:
conversational AI,
no-subscription fee,
large language models,
TensorRT-LLM Extension,
TensorRT-LLM Engine,
TensorRT,
tensorRT,
extension,
engine,
]
---

@@ -25,29 +25,20 @@ import { Callout, Steps } from 'nextra/components'

## Overview

This guide walks you through installing Jan's official [TensorRT-LLM Extension](https://github.com/janhq/nitro-tensorrt-llm). This extension uses [Nitro-TensorRT-LLM](https://github.com/janhq/nitro-tensorrt-llm) as the AI engine instead of the default [Nitro-Llama-CPP](https://github.com/janhq/nitro). It includes an efficient C++ server that executes the [TRT-LLM C++ runtime](https://nvidia.github.io/TensorRT-LLM/gpt_runtime.html) natively. It also includes features and performance improvements like OpenAI compatibility, tokenizer improvements, and queues.
This guide walks you through installing Jan's official [TensorRT-LLM Engine](https://github.com/janhq/nitro-tensorrt-llm). This engine uses [Cortex-TensorRT-LLM](https://github.com/janhq/cortex.tensorrt-llm) as the AI engine instead of the default [Cortex-Llama-CPP](https://github.com/janhq/cortex). It includes an efficient C++ server that executes the [TRT-LLM C++ runtime](https://nvidia.github.io/TensorRT-LLM/gpt_runtime.html) natively. It also includes features and performance improvements like OpenAI compatibility, tokenizer improvements, and queues.

<Callout type='warning' emoji="">
- This feature is only available for Windows users. Linux is coming soon.

- Additionally, we only prebuilt a few demo models. You can always build your desired models directly on your machine. For more information, please see [here](#build-your-own-tensorrt-models).
This feature is only available for Windows users. Linux is coming soon.
</Callout>

### Pre-requisites

- A Windows PC
- Nvidia GPU(s): Ada or Ampere series (i.e. RTX 4000s & 3000s). More will be supported soon.
- 3GB+ of disk space to download TRT-LLM artifacts and a Nitro binary
- Jan v0.4.9+ or Jan v0.4.8-321+ (nightly)
- Nvidia Driver v535+ (For installation guide, please see [here](/docs/troubleshooting#1-ensure-gpu-mode-requirements))
- CUDA Toolkit v12.2+ (For installation guide, please see [here](/docs/troubleshooting#1-ensure-gpu-mode-requirements))
- A **Windows** PC.
- **Nvidia GPU(s)**: Ada or Ampere series (i.e. RTX 4000s & 3000s). More will be supported soon.
- Sufficient disk space for the TensorRT-LLM models and data files (space requirements vary depending on the model size).

<Callout type='warning'>
If you are using our nightly builds, you may have to reinstall the TensorRT-LLM extension each time you update the app. We're working on better extension lifecycles - stay tuned.
</Callout>

<Steps>

### Step 1: Install TensorRT-Extension

1. Click the **Gear Icon (⚙️)** on the bottom left of your screen.
@@ -65,7 +56,7 @@ This guide walks you through installing Jan's official [TensorRT-LLM Extension](
3. Check that files are correctly downloaded.

```bash
ls ~/jan/extensions/@janhq/tensorrt-llm-extension/dist/bin
ls ~/jan/data/extensions/@janhq/tensorrt-llm-extension/dist/bin
# Your Extension Folder should now include `nitro.exe`, among other artifacts needed to run TRT-LLM
```

@@ -97,17 +88,4 @@ We offer a handful of precompiled models for Ampere and Ada cards that you can i
<br/>
![Specific Conversation](../_assets/model-parameters.png)

</Steps>

## Build your own TensorRT Engine
To create custom TensorRT engines, you can follow the step-by-step guide provided in the [NVIDIA documentation](https://nvidia.github.io/TensorRT-LLM/quick-start-guide.html#compile-the-model-into-a-tensorrt-engine).

When compiling and running these models, please adhere to the following compatibility guidelines:
- **GPU Architectures**: Models are specifically compiled for certain GPU architectures, such as Ada. Ensure that your model is compatible with the architecture of the GPU on which it will run.
- **TensorRT-LLM Release**: Models need to be compiled and run on the same version of the TensorRT-LLM. For example, a model compiled with version 0.9.0 must be run on version 0.9.0.
- **Operating System Compatibility**: As of version 0.9.0, models are designed to be cross-OS compatible. However, this feature is still under evaluation and might exhibit instability.
- **GPU Topology**: It is crucial to understand your system's GPU topology, especially when dealing with multiple GPUs. This can be determined by the number of engine files in use.

<Callout type='info'>
Ensure these parameters are aligned correctly to avoid runtime issues and fully leverage TensorRT engines' capabilities.
</Callout>
</Steps>
40 changes: 35 additions & 5 deletions src/pages/docs/data-folder.mdx
@@ -25,7 +25,32 @@ import { Callout, Steps } from 'nextra/components'
# Jan Data Folder
Jan stores your data locally in your own filesystem in a universal file format (JSON). We build for privacy by default and do not collect or sell your data.

This guide helps you understand where and how this data is stored. We'll also show you how to delete or move the data folder location.
This guide helps you understand where and how this data is stored.

## Open the Data Folder

To open the Jan data folder from the app:
1. Click the **System Monitor** button in your Jan app.
2. Click the **App Log** button.
3. This opens the Jan data folder.

```bash
# Windows
~/AppData/Roaming/Jan/data

# Mac
~/Library/Application\ Support/Jan/data

# Linux
## Custom installation directory
$XDG_CONFIG_HOME = /home/username/custom_config

or

## Default installation directory
~/.config/Jan/data

```
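
If you prefer the terminal, a small sketch for jumping straight to that folder is shown below. It simply reuses the default paths from the block above; `open` and `xdg-open` are the stock macOS/Linux launchers, and the last line assumes `cmd.exe` on Windows.

```bash
# Open the Jan data folder directly from a terminal (default locations).
open ~/Library/Application\ Support/Jan/data   # macOS
xdg-open ~/.config/Jan/data                    # Linux
explorer %APPDATA%\Jan\data                    # Windows (run from cmd.exe)
```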

## Folder Structure
Jan app data folder should have the following folder structure:
@@ -36,8 +61,6 @@ Jan is stored in the root `~/jan` by default.
/assistants
/jan
assistant.json
/shakespeare
assistant.json
/extensions
extensions.json
/@janhq
@@ -47,7 +70,8 @@ Jan is stored in the root `~/jan` by default.
/app.txt
/models
/model_A
model.json
model.yaml
model_A.yaml
/settings
settings.json
/@janhq
@@ -58,6 +82,11 @@ Jan is stored in the root `~/jan` by default.
/joi-dark
/joi-light
/night-blue
/themes
/dark-dimmed
/joi-dark
/joi-light
/night-blue
/threads
/jan_thread_A
messages.jsonl
@@ -100,7 +129,6 @@ Each parameter in the file is defined as follows:
| description | Describes the assistant’s capabilities and intended role. |
| model | Defines accessible models, with "*" representing access to all models. |
| instructions | Specifies queries and commands to tailor Jan's responses for improved interaction effectiveness. |
- **Custom Assistant Example**: The `/assistants/shakespeare/` shows a custom setup, also with its own `assistant.json`.
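
For orientation, a hedged sketch of what the default `assistants/jan/assistant.json` might contain is shown below; the field values are illustrative stand-ins, not the shipped defaults.

```bash
# Hypothetical contents of the default assistant file; values are illustrative.
cat ~/jan/assistants/jan/assistant.json
# {
#   "name": "Jan",
#   "description": "A default assistant that can use all downloaded models",
#   "model": "*",
#   "instructions": "Answer concisely and ask for clarification when needed."
# }
```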

### `extensions/`

@@ -126,6 +154,7 @@ General settings for the application are stored here, separate from individual a
- **General Settings**: The `settings.json` in the `/settings/` directory holds application-wide settings.
- **Extension-specific Settings**: Additional settings for extensions are stored in respective subdirectories under `/settings/@janhq/`.


### `themes/`

The `themes` directory contains different visual themes for the application, allowing customization of the user interface.
@@ -134,6 +163,7 @@ The `themes` directory contains different visual themes for the application, all

Threads history is kept in this directory. Each session or thread is stored in a way that makes it easy to review past interactions. Each thread is stored in its subdirectory, such as `/threads/jan_unixstamp/`, with files like `messages.jsonl` and `thread.json` detailing the thread settings.
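
A minimal sketch of inspecting one thread from the terminal is below; the `jan_1714021234` folder name is a made-up unix-timestamp example.

```bash
# Hypothetical thread folder; the timestamped name is illustrative.
ls ~/jan/threads/jan_1714021234/
# messages.jsonl  thread.json

# Each line of messages.jsonl is a single JSON message object.
head -n 1 ~/jan/threads/jan_1714021234/messages.jsonl
```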


## Open the Data Folder

To open the Jan data folder, follow the steps in the [Settings](/docs/settings#access-the-jan-data-folder) guide.
