The easiest way to contribute to the Emacs Rust core is to help port C functions over to Rust. In the process of doing so it will often expose gaps where new data structures or functionality is needed. See the list at the end of this document for examples of functions that still need to be ported.
The following sections describe the basic types used in Rune and how they interact with each other.
The type Object
is equivalent to LispObject
in the C core. It is a tagged pointer that is a superset of all possible concrete types. It is always 64 bits. There are other Sum types that represent a different group of concrete types, such as Number
which is all numeric types, or Function
which is all callable types. These can be matched into the different concrete types by using the untag
method.
The Gc<T>
type represents a generic tagged pointer. Object
is actually just a type alias for Gc<ObjectType>
. All Gc<T>
types share the same layout and bit patterns, meaning that any tagged pointer can be cast back to an Object
with into()
or as_obj()
.
Generally to go from a sum type (Object
, List
, Number
, Function
, etc) to a concrete type, you can either match on it with untag()
or convert it to a subtype with try_into()
. To convert a concert type back into a sum type, you can use into()
if it is already a GC managed type (such as LispString
, Cons
, etc) or use into_obj(cx)
if it is not GC managed.
We strive for good interop with Rust native types. Many primitive types can be changed to a Lisp type using the IntoObject
trait. This trait defines one method into_obj(cx)
that converts a type into a lisp tagged pointer. For example "hello".into_obj(cx)
will return a Gc<&LispString>
, which is a tagged pointer to a Lisp String. Alternatively, there is an add
method on Context
that will call into_obj
and then convert it to a Object
. So cx.add("hello")
will return Object
, which is Gc<ObjectType>
. Also a untagged object (e.g. &Cons
) can be converted to a tagged pointer with the tag()
method (returning Gc<&Cons>
).
Lisp lists can be created using the list!
macro. This accepts a variable number of arguments and converts each one into an object. To create just a single Cons
, use Cons::new
or Cons::new1
(when the cdr is nil
).
Builtin symbols that are defined in the Rust source are given constant values and definitions. For example to reference the lisp symbol lambda
, you would use sym::LAMBDA
. These constants can be used in match statements. For example to check if an object is the symbol lambda
, you could use matches!(x, ObjectType::Symbol(sym::LAMBDA))
. This can also be used to create objects from symbols like sym::LAMBDA.into()
. All read symbols are interned. Uninterned symbols can be created with Symbol::new_uninterned
.
nil
is an important value and has special support. Since it is a just a regular symbol, you could just check for nil
by looking for that symbol with matches!(x, ObjectType::Symbol(sym::NIL))
. nil
has a special constant for the ObjectType
enum, so you could write that as matches!(x, ObjectType::NIL)
. But you can also just use x.is_nil()
or x == NIL
. To create a nil
object, just use the constant value NIL
. Same goes for value t
via TRUE
.
Just like in GNU Emacs, ints are represented as a unboxed fixnum. This means that integers can be converted to objects without needing the GC heap, and can be taken directly from an Object
without following a pointer. However their range is less than a i64
due to tagging.
The Context
type is singleton that is passed through out the call graph and contains GC heap and other state. Creating any GC managed object requires using the Context
. This type is normally named cx
, and is passed as the last argument to functions that consume it. Calling garbage_collect
requires mutable access to the context. This ensures that no object that is created with the Context
can survive pass garbage collection (unless it is rooted). There can only be one instance of Context
per thread.
All objects created with a particular context cannot be accessed past garbage collection. To continue accessing an object, it needs to be rooted. This is done via the root!
macro. It either takes an existing object and shadows the name with a rooted version (root!(obj, cx)
), or takes an initializer (root!(x, new(Vec), cx)
or root!(x, init(Vec::new()), cx)
). Rust structures that are rooted have the type Rt<T>
(which stands for “Root Traceable”). Objects that are rooted have the type Rto<T>
(Root Traceable Object) which is a type alias for Rt<Slot<T>>
. The Slot
type allows object pointers to be updated during garbage collection.
The type Env
represents the lisp thread environment. Is passed as the second to last argument when Context
is also used, and passed as the last argument when Context
is not present in a function signature. All Lisp state should be stored in Env
.
New lisp variables are created using the defvar!
macro, which optionally takes a default value. A coresponding symbol with a uppercase SNAKE_CASE name is also created.
Lisp functions are normal Rust functions that are annotated with the #[defun]
proc macro. This macro will create a wrapper that converts Objects
into the requested types and also converts the return value back into an Object
. This allows functions to move much of their type checking out of the function body for cleaner implementations. For example a function that accepts a string and returns a usize could be written like this:
#[defun]
fn my_fun(x: &str) -> usize {
...
}
If a function needs to allocate new objects, it will need to accept a Context
parameter by reference.
#[defun]
fn my_fun(x: &str, cx: &Context) -> usize {
...
}
If a function need to access the environment, it will need to accept a Env
parameter.
#[defun]
fn my_fun(x: &str, env: &Env, cx: &Context) -> usize {
...
}
If a function needs to call garbage_collect
or calls a function that does (via the call!
macro) it will need to take &mut Context
. This means that all arguments need to be rooted as well. This is done by wrapping them in a Rto
type.
#[defun]
fn my_fun(x: &Rto<Object>, env: &Rt<Env>, cx: &mut Context) -> Object {
...
}
When calling a function that takes a mutable context (&mut Context
), Rust will lengthen the mutable borrow for as long as the returned value is accessed. This can be fixed by wrapping the call in the rebind!
macro.
let x = rebind!(my_func(x, &mut cx));
This is usually a sign that you need to root an object.
let x = cx.add("hello");
// root it
root!(x, cx);
mutable_call(&mut cx);
// access the variable again
let x = x.bind(cx);
Functions to manipulate case
- upcase
- downcase
- upcase-initials
- upcase-region
- downcase-region
- capitalize-region
- upcase-initials-region
- upcase-word
- downcase-word
- capitalize-word
Functions that operate on characters
- unibyte-char-to-multibyte
- multibyte-char-to-unibyte
- char-width
- string-width
- unibyte-string
- get-byte
Functions that operate on strings
- string-bytes
- string-distance
- string-equal
- compare-strings
- string-lessp
- string-version-lessp
- string-collate-lessp
- string-collate-equalp
- string-make-multibyte
- string-make-unibyte
- string-as-unibyte
- string-as-multibyte
- string-to-unibyte
- substring
- clear-string
- base64-encode-string
- base64url-encode-string
- base64-decode-string
- base64-encode-region
- base64url-encode-region
- base64-decode-region
Functions that operate on time formats
- time-add
- time-subtract
- time-less-p
- time-equal-p
- float-time
- format-time-string
- decode-time
- encode-time
- time-convert
- current-cpu-time
- current-time-string
- current-time-zone
- set-time-zone-rule
Functions for working with directories
- directory-files
- directory-files-and-attributes
- file-name-completion
- file-name-all-completions
- file-attributes-lessp
- system-users
- system-groups
Functions that operate on files
- make-temp-file-internal
- make-temp-name
- substitute-in-file-name
- copy-file
- make-directory-internal
- delete-directory-internal
- delete-file
- rename-file
- add-name-to-file
- make-symbolic-link
- file-name-absolute-p
- file-exists-p
- file-executable-p
- file-readable-p
- file-writable-p
- access-file
- file-accessible-directory-p
- file-regular-p
- file-selinux-context
- set-file-selinux-context
- file-acl
- set-file-acl
- file-modes
- set-file-modes
- set-default-file-modes
- default-file-modes
- set-file-times
- unix-sync
- file-newer-than-file-p
- insert-file-contents
- write-region
- verify-visited-file-modtime
- visited-file-modtime
- set-visited-file-modtime
- do-auto-save
- set-buffer-auto-saved
- clear-buffer-auto-save-failure
- recent-auto-save-p
- next-read-file-uses-dialog-p
- set-binary-mode
- file-system-info
Function to edit buffers and manipulate text.
- byte-to-string
- pos-bol
- line-beginning-position
- pos-eol
- line-end-position
- buffer-size
- point-min
- point-max
- gap-position
- gap-size
- position-bytes
- byte-to-position
- following-char
- preceding-char
- bobp
- eobp
- bolp
- eolp
- char-after
- char-before
- user-login-name
- user-real-login-name
- user-uid
- user-real-uid
- group-name
- group-gid
- group-real-gid
- user-full-name
- system-name
- emacs-pid
- insert
- insert-and-inherit
- insert-char
- insert-byte
- buffer-substring
- buffer-string
- insert-buffer-substring
- compare-buffer-substrings
- replace-buffer-contents
- subst-char-in-region
- translate-region-internal
- delete-region
- delete-and-extract-region
- internal–unlabel-restriction
- save-restriction
- ngettext
- message
- message-box
- message-or-box
- current-message
- propertize
- format
- format-message
- char-equal
- transpose-regions