Rendering images in terminal #384
Replies: 14 comments 20 replies
-
That would be a nice feature, but I probably wouldn't implement it in the main library. I need to draw a line under the feature-set to keep it maintainable. I'm hoping that there will eventually be an ecosystem of third-party libraries using the Console protocol for these things. Maybe it's something you would like to tackle yourself?
-
iTerm2 has an image protocol: https://iterm2.com/documentation-images.html. My particular use case is converting Jupyter notebooks to Markdown and then showing them in the terminal, using Rich to render the Markdown, with images. This repo on GitHub shows how to use the Rich console protocol to render cells: https://github.com/sandal-tan/nbcat/blob/master/nbcat/notebook.py. But how should one hook into the Markdown renderer to draw images using iTerm's image protocol? I suspect I need to change the following methods in ImageItem:

def on_enter(self, context: "MarkdownContext") -> None:
    self.link = context.current_style.link
    self.text = Text(justify="left")
    super().on_enter(context)

def __rich_console__(
    self, console: Console, options: ConsoleOptions
) -> RenderResult:
    link_style = Style(link=self.link or self.destination or None)
    title = self.text or Text(self.destination.strip("/").rsplit("/", 1)[-1])
    if self.hyperlinks:
        title.stylize(link_style)
    yield Text.assemble("🌆 ", title, " ", end="")

But should the text yielded just be the escape sequence? Or should I be using some other machinery within Rich?
-
You'll need to wrap the escape sequence in a Control instance to prevent Rich from word wrapping etc. I'm not sure if the iTerm protocol could be integrated more deeply into Rich, i.e. could you put it in a Panel? I suspect not.
-
Looking into this, it seems that there are a couple of terminal protocols that allow for images in terminals (sixel and ReGIS in particular). These seem like they could be better things to look into using than the proprietary protocol that iTerm uses. (I believe iTerm supports sixel anyway.)
-
Is anyone still interested in this? I really wanted this for a personal project I'm working on. Here is some code I wrote that does an OK job of writing an image to a string in console markup. Not the best, but OK. I may keep working on this as a separate project at some point and publish it to PyPI.
-
@willmcgugan Ofc there is a big BUT: So if you are interested in such a component, I might be able to help with the impl to some extent. NB:
-
We've been thinking about integrating sixel in Term.jl, a Julia library inspired by Rich: FedeClaudi/Term.jl#94. It's probably not trivial, but if anyone can do it, it's Will :P
-
I'm interested in adding support for iTerm2 inline images in my project, which is currently based on Rich for most of its other rich text formatting. At least initially, I don't need images to be a full-blown Rich renderable (to put in tables, etc.), but I would like to output a mix of rich text and these image escape sequences to a Rich console.

It looks like I need to use a Control() instance for this, as mentioned above, but I'm not quite seeing how to add a custom Control() sequence. It looks like the options for ControlType and the mapping from those to ANSI escape sequences are predefined. Without modifying or patching the Rich library, is there a way I can add a custom escape sequence that doesn't affect the formatting of other rich output?

On a related note, is there a way to extend the list of allowed markup tokens? I was thinking of trying to add a new [img] tag. The markup parser does seem to keep unknown tags in some cases, but this information later gets stripped out in the call to render(), leading to some rather ugly hacks to preserve it long enough to know where images need to be inserted. Is there a hook of some kind to add new markup tokens?

Thanks for any tips you can provide, and for all your work on Rich! It's a really excellent library, and I'm loving what it is capable of!
-
Following up on this, I eventually figured out that passing control=True in a call to Segment() does work, so there's no need to use a Control() instance or extend it to support new escape sequences. My first cut at implementing a renderable for iTerm2 inline images looked something like the following:

import base64

from rich.segment import Segment

class InlineImage:
    def __init__(self, data, **kwargs):
        def _b64(value):
            return base64.b64encode(value).decode('ascii')

        # iTerm2 only displays the file inline when inline=1 is set
        kwargs.update(inline=1)
        if 'name' in kwargs:
            # the protocol expects the file name to be base64-encoded
            kwargs['name'] = _b64(kwargs['name'].encode('utf-8'))
        args = ';'.join(f'{k}={v}' for k, v in kwargs.items())
        # OSC 1337 ; File=<args> : <base64 data> BEL
        self._ctrl = f'\x1b]1337;File={args}:{_b64(data)}\a'

    def __rich_console__(self, console, options):
        # control=True marks the segment as a non-printing control sequence
        yield Segment(self._ctrl, control=True)
        yield Segment('\n')

This works as long as you only try to render images on lines by themselves, and doing so will always leave the cursor at the beginning of the line following the bottom of the rendered image. The InlineImage() class takes the arguments described at https://iterm2.com/documentation-images.html to scale the image to different sizes (using width, height, and preserveAspectRatio). The "inline" argument will always be set automatically, and it would be easy to do that for the size argument as well, but it looks like that's not a required field. You can also optionally set a "name" for the image.

I had an additional requirement: I wanted to be able to print a prefix to the left of the image on all the lines it took up. I had previously asked about how to do this for other renderables in #867 and the

class InlineImage:
    def __init__(self, data, lines=1, **kwargs):
        def _b64(value):
            return base64.b64encode(value).decode('ascii')

        kwargs.update(inline=1)
        if 'name' in kwargs:
            kwargs['name'] = _b64(kwargs['name'].encode('utf-8'))
        args = ';'.join(f'{k}={v}' for k, v in kwargs.items())
        # after drawing, move the cursor back up to the image's first line
        move = f'\x1b[{lines-1}A' if lines > 1 else ''
        self._ctrl = f'\x1b]1337;File={args}:{_b64(data)}\a{move}'
        # one newline per line the image occupies
        self._text = lines * '\n'

    def __rich_console__(self, console, options):
        yield Segment(self._ctrl, control=True)
        yield Segment(self._text)

However, this requires that the caller pass in the Once It might be possible to draw a table completely FIRST and then move the cursor back up to the appropriate positions for each of the images you want to draw, moving the cursor back down at the end. The tricky part here would be determining what exact row and column each image should be drawn at. In theory, some kind of placeholders could be put into the table that could then be analyzed later by using
-
I've played around a bit with To test the following, you will need the test_kitty_img.py

#!/usr/bin/env python3
import io
from base64 import standard_b64encode

import requests
from PIL import Image
from textual.app import App
from textual.widgets import Static
from rich import print
from rich.segment import Segment

url = 'https://github.com/textualize/rich/raw/master/imgs/features.png'


class KittyImage:
    def __init__(self, url):
        # download the image, resize and convert to png
        img_response = requests.get(url, stream=True)
        img = Image.open(io.BytesIO(img_response.content))
        self.png = io.BytesIO()
        img.resize(size=(500, 500)).save(self.png, format='png')
        # fill up the buffer using the function from the example
        self.buf = io.BytesIO()
        self.write_chunked(a='T', f=100)
        self.buf.seek(0)
        # generate a Segment for rich to display
        self.segment = Segment(self.buf.read().decode())

    # the following two methods are essentially unchanged from the example in
    # https://sw.kovidgoyal.net/kitty/graphics-protocol/#a-minimal-example
    @staticmethod
    def serialize_gr_command(**cmd):
        payload = cmd.pop('payload', None)
        cmd = ','.join(f'{k}={v}' for k, v in cmd.items())
        ans = []
        w = ans.append
        w(b'\033_G'), w(cmd.encode('ascii'))
        if payload:
            w(b';')
            w(payload)
        w(b'\033\\')
        return b''.join(ans)

    def write_chunked(self, **cmd):
        self.png.seek(0)
        data = standard_b64encode(self.png.read())
        while data:
            chunk, data = data[:4096], data[4096:]
            m = 1 if data else 0
            self.buf.write(self.serialize_gr_command(payload=chunk, m=m, **cmd))
            self.buf.flush()
            cmd.clear()

    def __rich_console__(self, console, options):
        yield self.segment


# small app example
class Img(Static):
    def get_content_width(self, container, viewport):
        return 50


class ImageApp(App):
    def compose(self):
        yield Img(KittyImage(url))


if __name__ == "__main__":
    app = ImageApp()
    app.run()
    img = KittyImage(url)
    print(repr(img))
    print(img)

When I put the code above inside

from test_kitty_img import url, KittyImage
KittyImage(url)

Awesome! However, running Similarly, if you simply run the code above as a script (which will open a Textual UI and immediately afterwards print the image with It feels so close to working, but I'm missing something to actually be able to use it in a Textual app (or at least a command-line tool). Any suggestions?
-
The Python package image-in-terminal performs well by replacing every two pixels of an image with the character ▀ (Upper Half Block). This makes it most suitable for displaying low-resolution images. Higher-resolution images can also be displayed in the terminal, but performance may decrease and the terminal's contents will need to be zoomed out to view them. This package served my small project well, and it might also be useful for others.
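
The half-block approach described above can be sketched in a few lines. This is only an illustration of the general technique, not the image-in-terminal package's actual code: the function name half_block_lines and the list-of-RGB-tuples input are assumptions made for the example, and it assumes a terminal that understands 24-bit color escape sequences. Each terminal cell shows two vertically stacked pixels, the top one as the foreground color of ▀ and the bottom one as its background color.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]

UPPER_HALF_BLOCK = "\u2580"  # the character ▀
RESET = "\x1b[0m"


def half_block_lines(pixels: Sequence[Sequence[RGB]]) -> List[str]:
    """Render a grid of RGB pixels as text, two pixel rows per terminal row.

    Hypothetical helper for illustration; `pixels` is a list of rows,
    each row a list of (r, g, b) tuples.
    """
    lines = []
    for y in range(0, len(pixels), 2):
        top_row = pixels[y]
        # With an odd number of pixel rows, pad the last bottom half with black.
        if y + 1 < len(pixels):
            bottom_row = pixels[y + 1]
        else:
            bottom_row = [(0, 0, 0)] * len(top_row)
        cells = []
        for (tr, tg, tb), (br, bg, bb) in zip(top_row, bottom_row):
            # 38;2 selects a 24-bit foreground (top pixel),
            # 48;2 selects a 24-bit background (bottom pixel).
            cells.append(
                f"\x1b[38;2;{tr};{tg};{tb}m\x1b[48;2;{br};{bg};{bb}m{UPPER_HALF_BLOCK}"
            )
        # Reset colors at the end of every line so they do not bleed.
        lines.append("".join(cells) + RESET)
    return lines
```

Printing "\n".join(half_block_lines(pixels)) draws the image. Since one character covers only two pixels, an image N pixels tall needs about N/2 terminal rows, which is why this works best for low-resolution images.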
-
It seems this thread has been open for a while now, but since I recently played around with Kitty's Terminal Graphics Protocol and Textual, I'll leave my results here. Maybe someone will find them interesting. I got a working solution, even though it's a bit hacky:

#!/usr/bin/env python
import io
import sys
from base64 import b64encode
from PIL import Image
from click import style
from textual.widgets import Label
from rich.segment import Segment
from textual.widget import Widget
from textual.app import App, ComposeResult
from textual.geometry import Size, NULL_SIZE
from textual.containers import Center, Middle
from rich.console import (
    Console,
    ConsoleOptions,
    ConsoleRenderable,
    RenderResult,
    RichCast,
)
from rich.style import Style
PLACEHOLDER = 0x10EEEE
# fmt: off
NUMBER_TO_DIACRITIC = [
0x00305, 0x0030d, 0x0030e, 0x00310, 0x00312, 0x0033d, 0x0033e, 0x0033f, 0x00346, 0x0034a, 0x0034b, 0x0034c, 0x00350, 0x00351, 0x00352, 0x00357,
0x0035b, 0x00363, 0x00364, 0x00365, 0x00366, 0x00367, 0x00368, 0x00369, 0x0036a, 0x0036b, 0x0036c, 0x0036d, 0x0036e, 0x0036f, 0x00483, 0x00484,
0x00485, 0x00486, 0x00487, 0x00592, 0x00593, 0x00594, 0x00595, 0x00597, 0x00598, 0x00599, 0x0059c, 0x0059d, 0x0059e, 0x0059f, 0x005a0, 0x005a1,
0x005a8, 0x005a9, 0x005ab, 0x005ac, 0x005af, 0x005c4, 0x00610, 0x00611, 0x00612, 0x00613, 0x00614, 0x00615, 0x00616, 0x00617, 0x00657, 0x00658,
0x00659, 0x0065a, 0x0065b, 0x0065d, 0x0065e, 0x006d6, 0x006d7, 0x006d8, 0x006d9, 0x006da, 0x006db, 0x006dc, 0x006df, 0x006e0, 0x006e1, 0x006e2,
0x006e4, 0x006e7, 0x006e8, 0x006eb, 0x006ec, 0x00730, 0x00732, 0x00733, 0x00735, 0x00736, 0x0073a, 0x0073d, 0x0073f, 0x00740, 0x00741, 0x00743,
0x00745, 0x00747, 0x00749, 0x0074a, 0x007eb, 0x007ec, 0x007ed, 0x007ee, 0x007ef, 0x007f0, 0x007f1, 0x007f3, 0x00816, 0x00817, 0x00818, 0x00819,
0x0081b, 0x0081c, 0x0081d, 0x0081e, 0x0081f, 0x00820, 0x00821, 0x00822, 0x00823, 0x00825, 0x00826, 0x00827, 0x00829, 0x0082a, 0x0082b, 0x0082c,
0x0082d, 0x00951, 0x00953, 0x00954, 0x00f82, 0x00f83, 0x00f86, 0x00f87, 0x0135d, 0x0135e, 0x0135f, 0x017dd, 0x0193a, 0x01a17, 0x01a75, 0x01a76,
0x01a77, 0x01a78, 0x01a79, 0x01a7a, 0x01a7b, 0x01a7c, 0x01b6b, 0x01b6d, 0x01b6e, 0x01b6f, 0x01b70, 0x01b71, 0x01b72, 0x01b73, 0x01cd0, 0x01cd1,
0x01cd2, 0x01cda, 0x01cdb, 0x01ce0, 0x01dc0, 0x01dc1, 0x01dc3, 0x01dc4, 0x01dc5, 0x01dc6, 0x01dc7, 0x01dc8, 0x01dc9, 0x01dcb, 0x01dcc, 0x01dd1,
0x01dd2, 0x01dd3, 0x01dd4, 0x01dd5, 0x01dd6, 0x01dd7, 0x01dd8, 0x01dd9, 0x01dda, 0x01ddb, 0x01ddc, 0x01ddd, 0x01dde, 0x01ddf, 0x01de0, 0x01de1,
0x01de2, 0x01de3, 0x01de4, 0x01de5, 0x01de6, 0x01dfe, 0x020d0, 0x020d1, 0x020d4, 0x020d5, 0x020d6, 0x020d7, 0x020db, 0x020dc, 0x020e1, 0x020e7,
0x020e9, 0x020f0, 0x02cef, 0x02cf0, 0x02cf1, 0x02de0, 0x02de1, 0x02de2, 0x02de3, 0x02de4, 0x02de5, 0x02de6, 0x02de7, 0x02de8, 0x02de9, 0x02dea,
0x02deb, 0x02dec, 0x02ded, 0x02dee, 0x02def, 0x02df0, 0x02df1, 0x02df2, 0x02df3, 0x02df4, 0x02df5, 0x02df6, 0x02df7, 0x02df8, 0x02df9, 0x02dfa,
0x02dfb, 0x02dfc, 0x02dfd, 0x02dfe, 0x02dff, 0x0a66f, 0x0a67c, 0x0a67d, 0x0a6f0, 0x0a6f1, 0x0a8e0, 0x0a8e1, 0x0a8e2, 0x0a8e3, 0x0a8e4, 0x0a8e5,
0x0a8e6, 0x0a8e7, 0x0a8e8, 0x0a8e9, 0x0a8ea, 0x0a8eb, 0x0a8ec, 0x0a8ed, 0x0a8ee, 0x0a8ef, 0x0a8f0, 0x0a8f1, 0x0aab0, 0x0aab2, 0x0aab3, 0x0aab7,
0x0aab8, 0x0aabe, 0x0aabf, 0x0aac1, 0x0fe20, 0x0fe21, 0x0fe22, 0x0fe23, 0x0fe24, 0x0fe25, 0x0fe26, 0x10a0f, 0x10a38, 0x1d185, 0x1d186, 0x1d187,
0x1d188, 0x1d189, 0x1d1aa, 0x1d1ab, 0x1d1ac, 0x1d1ad, 0x1d242, 0x1d243, 0x1d244
]
# fmt: on


class KittyImage(Widget):
    _next_image_id = 1

    class _Renderable:
        def __init__(self, image_id: int, size: Size) -> None:
            self._image_id = image_id
            self._size = size

        def __rich_console__(
            self, _console: Console, _options: ConsoleOptions
        ) -> RenderResult:
            # The lower 24 bits of the image ID go into the foreground color,
            # the upper 8 bits into a third diacritic on every placeholder.
            style = Style(color=f"rgb({(self._image_id >> 16) & 255}, {(self._image_id >> 8) & 255}, {self._image_id & 255})")
            id_char = NUMBER_TO_DIACRITIC[(self._image_id >> 24) & 255]
            for r in range(self._size.height):
                line = ""
                for c in range(self._size.width):
                    # placeholder + row diacritic + column diacritic + ID diacritic
                    line += f"{chr(PLACEHOLDER)}{chr(NUMBER_TO_DIACRITIC[r])}{chr(NUMBER_TO_DIACRITIC[c])}{chr(id_char)}"
                line += "\n"
                yield Segment(line, style=style)

    def __init__(
        self,
        image: Image.Image,
        *,
        name: str | None = None,
        id: str | None = None,
        classes: str | None = None,
        disabled: bool = False,
    ) -> None:
        super().__init__(name=name, id=id, classes=classes, disabled=disabled)
        image_buffer = io.BytesIO()
        image.save(image_buffer, format="png")
        self._image_data = image_buffer.getvalue()
        self._image_id = KittyImage._next_image_id
        KittyImage._next_image_id += 1
        self._placement_size = NULL_SIZE
        self._send_image_to_terminal()

    def _send_image_to_terminal(self) -> None:
        data = b64encode(self._image_data)
        while data:
            chunk, data = data[:4096], data[4096:]
            ans = [
                f"\033_Gi={self._image_id},m={1 if data else 0},f=100,q=2".encode(
                    "ascii"
                )
            ]
            if chunk:
                ans.append(b";")
                ans.append(chunk)
            ans.append(b"\033\\")
            # Dangerous. Could interfere with the writer thread. But we can't use textual's functions
            # to write to the terminal.
            # It buffers output. There's no way around that (Driver.flush() is a no-op).
            # This buffering re-chunks the data which leads to a failed transmission.
            sys.__stdout__.buffer.write(b"".join(ans))
            sys.__stdout__.buffer.flush()

    def _create_virtual_placement(self, size: Size) -> None:
        # Same issue as above, even though the size of the data probably would still work with the
        # buffering. But we have this hack in place anyway, so it shouldn't matter anymore.
        sys.__stdout__.buffer.write(
            f"\033_Ga=p,U=1,i={self._image_id},c={size.width},r={size.height},q=2\033\\".encode(
                "ascii"
            )
        )
        sys.__stdout__.flush()

    def render(self) -> ConsoleRenderable | RichCast:
        # Re-create the virtual placement whenever the widget's size changes.
        if self._placement_size != self.content_size:
            self._create_virtual_placement(self.content_size)
            self._placement_size = self.content_size
        return KittyImage._Renderable(self._image_id, self.content_size)


class ImageApp(App[None]):
    def compose(self) -> ComposeResult:
        with Center():
            with Middle():
                yield KittyImage(Image.open("image.png"))

    def on_mount(self) -> None:
        self.query_one(KittyImage).styles.width = 20
        self.query_one(KittyImage).styles.height = 20


if __name__ == "__main__":
    ImageApp().run()

Disclaimer: This isn't tested very well and could probably be improved. The approach is to send the image to Kitty and use the Unicode Placeholders for the actual display. While it basically works, it comes with a few issues: writing to the terminal, reading terminal responses, and aligning the image in the container.

Edit: The last statement seems to be wrong. Getting the terminal size in both pixels and cells can be done with an ioctl. Therefore it should be straightforward to calculate the padding. It just needs to be implemented.
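
The padding calculation mentioned in the edit could look roughly like the sketch below. It is not part of the widget above and has not been tested against Textual: the names WindowSize, parse_winsize, query_winsize, cell_size and center_padding are hypothetical helpers, and it assumes a Unix-like system where the TIOCGWINSZ ioctl fills in the pixel fields (some terminals leave them at zero, in which case the sketch raises an error instead of guessing).

```python
import struct
from typing import NamedTuple, Tuple


class WindowSize(NamedTuple):
    rows: int
    cols: int
    width_px: int
    height_px: int


def parse_winsize(raw: bytes) -> WindowSize:
    """Decode a struct winsize: four unsigned shorts (rows, cols, xpixel, ypixel)."""
    rows, cols, width_px, height_px = struct.unpack("HHHH", raw)
    return WindowSize(rows, cols, width_px, height_px)


def query_winsize(fd: int = 1) -> WindowSize:
    """Ask the terminal behind file descriptor `fd` for its size (Unix only)."""
    import fcntl
    import termios

    raw = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    return parse_winsize(raw)


def cell_size(ws: WindowSize) -> Tuple[float, float]:
    """Pixel width and height of a single character cell."""
    if not (ws.rows and ws.cols and ws.width_px and ws.height_px):
        raise ValueError("terminal did not report its size in pixels")
    return ws.width_px / ws.cols, ws.height_px / ws.rows


def center_padding(
    image_px: Tuple[int, int], container_cells: Tuple[int, int], ws: WindowSize
) -> Tuple[int, int, int, int]:
    """Return (cols, rows, pad_left, pad_top) for an image scaled to fit a container.

    The image keeps its aspect ratio; the padding centers it in the container.
    """
    cell_w, cell_h = cell_size(ws)
    img_w, img_h = image_px
    box_cols, box_rows = container_cells
    # Scale factor that makes the image fit the container's pixel area.
    scale = min(box_cols * cell_w / img_w, box_rows * cell_h / img_h)
    cols = max(1, round(img_w * scale / cell_w))
    rows = max(1, round(img_h * scale / cell_h))
    return cols, rows, (box_cols - cols) // 2, (box_rows - rows) // 2
```

With values like these, the c= and r= keys of the virtual placement could be set to the scaled size, and the remaining cells of the container filled with blanks.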
-
Hello, I opened another issue on Textual and then found this.
-
I continued to play around with using Kitty's Terminal Graphics Protocol in Textual and Rich, and it seems to work quite well now (at least in my tests). I created a package from my results: https://github.com/lnqs/textual-kitty
-
Hi Will,
I was wondering if you had any plans to add support for images in Rich?
Or something like ascii drawings (e.g. see braille)?
Cheers,
Fede