Metadata values containing jsonb arrays are now supported. (#18)
Predicates can take a list as a value. A comparison operator
of "@>" now tests for array containment. This is an indexed
operation. We allow for list/tuple PredicateValues with
elements of mixed type.

Fixes to upgrade to latest openai
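
As an illustration of the containment test described above, here is a minimal pure-Python sketch of the semantics. It assumes the behavior mirrors PostgreSQL's jsonb `@>` on flat arrays of scalars, which is what a call shaped like `client.Predicates("tags", "@>", ["fox", 1])` would rely on (the `tags` key is hypothetical). The helper names `_same_scalar` and `array_contains` are invented for this sketch and are not part of the library; the real comparison runs inside the database.

```python
def _same_scalar(a, b):
    # JSON booleans are distinct from numbers, but Python treats True == 1,
    # so compare booleans only with booleans.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def array_contains(stored, wanted):
    # True when every element of `wanted` appears somewhere in `stored`.
    # Order and duplicates are ignored; elements may be of mixed type.
    return all(any(_same_scalar(s, w) for s in stored) for w in wanted)


print(array_contains(["fox", 1, 2.5], ["fox", 1]))  # mixed str/int elements
print(array_contains(["fox"], ["fox", "dog"]))      # "dog" is missing
```

The sketch covers only flat lists of scalars; nested arrays and objects follow further jsonb containment rules that are not modeled here.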
jgpruitt authored May 14, 2024
1 parent babd036 commit f6da527
Showing 7 changed files with 205 additions and 195 deletions.
48 changes: 24 additions & 24 deletions README.md
@@ -120,12 +120,12 @@ Now, you can query for similar items:
vec.search([1.0, 9.0])
```

-[[UUID('73d05df0-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('45ecb666-0f15-11ef-8d89-e666703872d0'),
{'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
0.00016793422934946456],
-[UUID('73d05d6e-84c1-11ee-98da-6ee10b77fd08'),
+[UUID('45ecb350-0f15-11ef-8d89-e666703872d0'),
{'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -141,7 +141,7 @@ constrained by a metadata filter.
vec.search([1.0, 9.0], limit=1, filter={"action": "jump"})
```

-[[UUID('73d05df0-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('45ecb666-0f15-11ef-8d89-e666703872d0'),
{'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -165,7 +165,7 @@ records = vec.search([1.0, 9.0], limit=1, filter={"action": "jump"})
(records[0]["id"],records[0]["metadata"], records[0]["contents"], records[0]["embedding"], records[0]["distance"])
```

-(UUID('73d05df0-84c1-11ee-98da-6ee10b77fd08'),
+(UUID('45ecb666-0f15-11ef-8d89-e666703872d0'),
{'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -228,12 +228,12 @@ The basic query looks like:
vec.search([1.0, 9.0])
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
0.00016793422934946456],
-[UUID('7487af14-84c1-11ee-98da-6ee10b77fd08'),
+[UUID('4d629a50-0f15-11ef-8d89-e666703872d0'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -245,7 +245,7 @@ You could provide a limit for the number of items returned:
vec.search([1.0, 9.0], limit=1)
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -270,7 +270,7 @@ unconstrained):
vec.search([1.0, 9.0], limit=1, filter={"action": "sit"})
```

-[[UUID('7487af14-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629a50-0f15-11ef-8d89-e666703872d0'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -283,12 +283,12 @@ returned if it matches any dict:
vec.search([1.0, 9.0], limit=2, filter=[{"action": "jump"}, {"animal": "fox"}])
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
0.00016793422934946456],
-[UUID('7487af14-84c1-11ee-98da-6ee10b77fd08'),
+[UUID('4d629a50-0f15-11ef-8d89-e666703872d0'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -303,7 +303,7 @@ could use greater than and less than conditions on numeric values.
vec.search([1.0, 9.0], limit=2, predicates=client.Predicates("times", ">", 1))
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -327,7 +327,7 @@ use the right type. Supported Python types are: `str`, `int`, and
vec.search([1.0, 9.0], limit=2, predicates=client.Predicates("action", "==", "jump"))
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -341,7 +341,7 @@ combining using OR semantic). So you can do:
vec.search([1.0, 9.0], limit=2, predicates=client.Predicates("action", "==", "jump") & client.Predicates("times", ">", 1))
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -364,7 +364,7 @@ my_predicates = client.Predicates("action", "==", "jump") & (client.Predicates("
vec.search([1.0, 9.0], limit=2, predicates=my_predicates)
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -378,7 +378,7 @@ semantics. You can pass in multiple 3-tuples to
vec.search([1.0, 9.0], limit=2, predicates=client.Predicates(("action", "==", "jump"), ("times", ">", 10)))
```

-[[UUID('7487af96-84c1-11ee-98da-6ee10b77fd08'),
+[[UUID('4d629b54-0f15-11ef-8d89-e666703872d0'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -410,7 +410,7 @@ Then, you can filter using the timestamps by specifying a
tpvec.search([1.0, 9.0], limit=4, uuid_time_filter=client.UUIDTimeRange(specific_datetime, specific_datetime+timedelta(days=1)))
```

-[[UUID('33c52800-ef15-11e7-be03-4f1f9a1bde5a'),
+[[UUID('95899000-ef1d-11e7-990e-7d2f7e013038'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -426,12 +426,12 @@ unconstrained.
tpvec.search([1.0, 9.0], limit=4, uuid_time_filter=client.UUIDTimeRange(start_date=specific_datetime))
```

-[[UUID('ac8be800-0de6-11e9-889a-5eec84ba8a7b'),
+[[UUID('0e505000-0def-11e9-8732-a154fea6fb50'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
0.00016793422934946456],
-[UUID('33c52800-ef15-11e7-be03-4f1f9a1bde5a'),
+[UUID('95899000-ef1d-11e7-990e-7d2f7e013038'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -448,7 +448,7 @@ One example:
tpvec.search([1.0, 9.0], limit=4, uuid_time_filter=client.UUIDTimeRange(start_date=specific_datetime, start_inclusive=False))
```

-[[UUID('ac8be800-0de6-11e9-889a-5eec84ba8a7b'),
+[[UUID('0e505000-0def-11e9-8732-a154fea6fb50'),
{'times': 100, 'action': 'jump', 'animal': 'fox'},
'jumped over the',
array([ 1. , 10.8], dtype=float32),
@@ -470,7 +470,7 @@ filters and `__uuid_timestamp` for predicates. Some examples below:
tpvec.search([1.0, 9.0], limit=4, filter={ "__start_date": specific_datetime, "__end_date": specific_datetime+timedelta(days=1)})
```

-[[UUID('33c52800-ef15-11e7-be03-4f1f9a1bde5a'),
+[[UUID('95899000-ef1d-11e7-990e-7d2f7e013038'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -481,7 +481,7 @@ tpvec.search([1.0, 9.0], limit=4,
predicates=client.Predicates("__uuid_timestamp", ">", specific_datetime) & client.Predicates("__uuid_timestamp", "<", specific_datetime+timedelta(days=1)))
```

-[[UUID('33c52800-ef15-11e7-be03-4f1f9a1bde5a'),
+[[UUID('95899000-ef1d-11e7-990e-7d2f7e013038'),
{'times': 1, 'action': 'sit', 'animal': 'fox'},
'the brown fox',
array([1. , 1.3], dtype=float32),
@@ -850,7 +850,7 @@ import psycopg2
from langchain.docstore.document import Document
from langchain.text_splitter import CharacterTextSplitter
from timescale_vector import client, pgvectorizer
-from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain_openai import OpenAIEmbeddings
from langchain.vectorstores.timescalevector import TimescaleVector
from datetime import timedelta
```
@@ -963,8 +963,8 @@ res = vector_store.similarity_search_with_score("Blogs about cats")
res
```

-[(Document(page_content='Author Matvey Arye, title: First Post, contents:some super interesting content about cats.', metadata={'id': '4a784000-4bc4-11eb-855a-06302dbc8ce7', 'author': 'Matvey Arye', 'blog_id': 1, 'category': 'AI', 'published_time': '2021-01-01T00:00:00+00:00'}),
-0.12595687795193833)]
+[(Document(page_content='Author Matvey Arye, title: First Post, contents:some super interesting content about cats.', metadata={'id': '4a784000-4bc4-11eb-979c-e8748f6439f2', 'author': 'Matvey Arye', 'blog_id': 1, 'category': 'AI', 'published_time': '2021-01-01T00:00:00+00:00'}),
+0.12657619616729976)]

## Development
