[Feature] POC: `openbb-store` OBBject Extension For Data/Python Object Interchange #6509

deeleeramone · 2024-06-18T20:47:23Z

This is a WIP and POC, feedback is welcome.

The goal is to facilitate data and Python object interchange, particularly over networks, within the OpenBB Platform ecosystem.

Example Use-Case:

Run a script to collect data for a ticker hitting a number of endpoints.
Add each item, raw OBBject response or a filtered subset, to the Store.
Export the collection as a file to share, use later, or transport across a network.

Simple Use-Case:

Store lists of symbols to be used as function inputs, e.g., a watchlist.

Below is pasted from the README.md file.

OBBject Store Extension

openbb-store is an OBBject extension for storing and retrieving OBBjects, Data, DataFrames, dictionaries, lists, and strings.

Each entry is stored as a compressed pickle, with a SHA1 signature, using the LZMA module with the "xz" algorithm set to maximum compression.
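A minimal sketch of that storage scheme, using only the standard library. The `pack` and `unpack` helpers are hypothetical names for illustration and are not the extension's own functions. The sketch assumes the signature is a SHA1 digest of the compressed payload.

```python
import hashlib
import lzma
import pickle


def pack(obj):
    """Pickle and xz-compress an object; return (payload, sha1_hex)."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    # preset=9 is the highest standard xz compression level.
    payload = lzma.compress(raw, format=lzma.FORMAT_XZ, preset=9)
    signature = hashlib.sha1(payload).hexdigest()
    return payload, signature


def unpack(payload, signature):
    """Check the SHA1 digest first, then decompress and unpickle."""
    if hashlib.sha1(payload).hexdigest() != signature:
        raise ValueError("Signature mismatch; refusing to unpickle.")
    return pickle.loads(lzma.decompress(payload, format=lzma.FORMAT_XZ))


watchlist = ["NVDA", "AMD", "INTC"]
payload, signature = pack(watchlist)
restored = unpack(payload, signature)
```

A plain digest like this detects accidental changes to the payload. Unpickling data from an untrusted source remains unsafe regardless.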

Installation

Install this extension by navigating into the directory and entering:

pip install -e .

Then, rebuild the Python interface:

python -c "import openbb;openbb.build()"

Store Class

Within the OpenBB Platform, the extension acts as a Global class with methods to add, retrieve, and save groups of data objects to memory or file in a transportable and compressed format.

When used standalone, the user_data_directory property (preference) should be set to the desired read/write directory
upon initialization. Alternatively, specify the complete path to the file in the filename parameter of the IO methods.

Usage

Every output from the OpenBB Platform Python interface will have the store attribute.

Supported Data Types

The following is a list of supported data objects:

OBBject
Data (generic OpenBB Data class)
DataFrame
List
Dictionary
String

The contents of any object being added must be serializable.
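One way to check this ahead of time, assuming "serializable" means "can be pickled" (entries are stored as pickles, per the description above). The `is_serializable` helper is a hypothetical name and not part of the extension.

```python
import pickle
import threading


def is_serializable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain containers of strings and numbers pickle without trouble.
ok = is_serializable({"symbols": ["NVDA", "AMD"]})

# A lock is an example of an object that pickle rejects.
not_ok = is_serializable({"lock": threading.Lock()})
```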

Add Data

from openbb import obb

data = obb.equity.price.historical("NVDA", provider="yfinance", start_date="2023-01-01", end_date="2023-12-31")
data.store.add_store(data=data, name="nvda2023")

A confirmation will display unless the "verbose" property is set to False.

"Data store 'nvda2023' added successfully."

Additional data can be added to the collection, and then exported as a single package.

data = obb.equity.fundamental.metrics("NVDA", provider="yfinance")
data.store.add_store(data = data.to_df().set_index("symbol").T, name="nvdaMetrics", description="Key Valuation Metrics for NVDA.")

"Data store 'nvdaMetrics' added successfully."

Directory Of Objects

An inventory of stored objects is displayed with the 'directory' property.

data.store.directory

{'nvda2023': {'description': None,
  'data_class': 'OBBject',
  'schema_preview': "{'length': 250, 'fields_set': ['open', 'high', 'low', 'close', 'volume', 'split_..."},
 'nvdaMetrics': {'description': 'Key Valuation Metrics for NVDA.',
  'data_class': 'DataFrame',
  'schema_preview': "{'length': 34, 'width': 1, 'columns': Index(['NVDA'], dtype='object', name='symb..."}}

Schemas

Metadata related to the schema are stored independently of the actual data store.
Schemas are retrieved with the get_schema method, using the assigned 'name' as the key.

Example DataFrame schema:

data.store.get_schema("nvdaMetrics")

{'length': 34,
 'width': 1,
 'columns': Index(['NVDA'], dtype='object', name='symbol'),
 'index': Index(['market_cap', 'pe_ratio', 'forward_pe', 'peg_ratio', 'peg_ratio_ttm',
        'enterprise_to_ebitda', 'earnings_growth', 'earnings_growth_quarterly',
        'revenue_per_share', 'revenue_growth', 'enterprise_to_revenue',
        'quick_ratio', 'current_ratio', 'debt_to_equity', 'gross_margin',
        'operating_margin', 'ebitda_margin', 'profit_margin',
        'return_on_assets', 'return_on_equity', 'dividend_yield',
        'dividend_yield_5y_avg', 'payout_ratio', 'book_value', 'price_to_book',
        'enterprise_value', 'overall_risk', 'audit_risk', 'board_risk',
        'compensation_risk', 'shareholder_rights_risk', 'beta',
        'price_return_1y', 'currency'],
       dtype='object'),
 'types_map': symbol
 NVDA    object
 dtype: object}

Example Pydantic model schema:

data.store.get_schema("nvda2023")

{'length': 250,
 'fields_set': ['open',
  'high',
  'low',
  'close',
  'volume',
  'split_ratio',
  'dividend'],
 'data_model': {'additionalProperties': True,
  'description': 'Yahoo Finance Equity Historical Price Data.',
  'properties': {'date': {'anyOf': [{'format': 'date', 'type': 'string'},
     {'format': 'date-time', 'type': 'string'}],
    'description': 'The date of the data.',
    'title': 'Date'},
   'open': {'description': 'The open price.',
    'title': 'Open',
    'type': 'number'},
   'high': {'description': 'The high price.',
    'title': 'High',
    'type': 'number'},
   'low': {'description': 'The low price.', 'title': 'Low', 'type': 'number'},
   'close': {'description': 'The close price.',
    'title': 'Close',
    'type': 'number'},
   'volume': {'anyOf': [{'type': 'number'},
     {'type': 'integer'},
     {'type': 'null'}],
    'default': None,
    'description': 'The trading volume.',
    'title': 'Volume'},
   'vwap': {'anyOf': [{'type': 'number'}, {'type': 'null'}],
    'default': None,
    'description': 'Volume Weighted Average Price over the period.',
    'title': 'Vwap'},
   'split_ratio': {'anyOf': [{'type': 'number'}, {'type': 'null'}],
    'default': None,
    'description': 'Ratio of the equity split, if a split occurred.',
    'title': 'Split Ratio'},
   'dividend': {'anyOf': [{'type': 'number'}, {'type': 'null'}],
    'default': None,
    'description': 'Dividend amount (split-adjusted), if a dividend was paid.',
    'title': 'Dividend'}},
  'required': ['date', 'open', 'high', 'low', 'close'],
  'title': 'YFinanceEquityHistoricalData',
  'type': 'object'},
 'created_at': '2024-06-18 13:08:44.778360',
 'uid': '06671e94-d271-7d4f-8000-43094acbb703'}
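For plain Python objects, a schema of this kind could be derived roughly as follows. This is a sketch only: `describe` is a hypothetical helper, `item_types` is an invented key, and the other key names simply mirror the outputs shown above. The extension's actual schema layout may differ.

```python
def describe(obj):
    """Build a small schema dict for a str, dict, or list."""
    if isinstance(obj, str):
        return {"data_class": "str", "length": len(obj)}
    if isinstance(obj, dict):
        return {
            "data_class": "dict",
            "length": len(obj),
            # Map each field name to the type name of its value.
            "types_map": {key: type(value).__name__ for key, value in obj.items()},
        }
    if isinstance(obj, list):
        return {
            "data_class": "list",
            "length": len(obj),
            # Distinct type names found among the list items.
            "item_types": sorted({type(item).__name__ for item in obj}),
        }
    raise TypeError(f"Unsupported type: {type(obj).__name__}")


schema = describe({"symbol": "NVDA", "beta": 1.694, "currency": "USD"})
list_schema = describe(["NVDA", "AMD"])
```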

Restore Data

Restore data from the Store extension by using the get_store method. The archive is validated against a signature before opening.

data.store.get_store("nvdaMetrics")

index	NVDA
market_cap	3335037648896.0
pe_ratio	79.28655
forward_pe	37.661114
peg_ratio	1.04
peg_ratio_ttm	1.5532
enterprise_to_ebitda	67.277
earnings_growth	6.5
earnings_growth_quarterly	6.284
revenue_per_share	3.234
revenue_growth	2.621
enterprise_to_revenue	41.556
quick_ratio	2.877
current_ratio	3.529
debt_to_equity	22.866
gross_margin	0.75286
operating_margin	0.64925003
ebitda_margin	0.61768
profit_margin	0.53398
return_on_assets	0.49103
return_on_equity	1.15658
dividend_yield	0.00029999999
dividend_yield_5y_avg	0.0012
payout_ratio	0.0094
book_value	1.998
price_to_book	67.85786
enterprise_value	3315066994688
overall_risk	7.0
audit_risk	7.0
board_risk	10.0
compensation_risk	1.0
shareholder_rights_risk	6.0
beta	1.694
price_return_1y	2.149727
currency	USD

When the stored object is an instance of OBBject, the element to retrieve can be isolated with the element parameter.
By default, it is "dataframe". When set as "OBBject", the object is restored in its original form.

data.store.get_store("nvda2023", element="OBBject")

OBBject

id: 06671e94-d271-7d4f-8000-43094acbb703
results: [{'date': datetime.date(2023, 1, 3), 'open': 14.85099983215332, 'high': 14...
provider: yfinance
warnings: None
chart: None
extra: {'metadata': {'arguments': {'provider_choices': {'provider': 'yfinance'}, 's...

Exporting/Importing

Any item(s) loaded into the extension can be exported to file as a ".xz" archive.
A list of "names" isolates specific objects for writing to disk. Without supplying names,
all entries are exported.

data.store.save_store_to_file(filename="nvda")

Importing works the same way, and a list of "names" can also be included to load only the desired elements.

data.store.load_store_from_file(filename="nvda")

The default path can be overridden by including the complete path, beginning with "/", in the filename.
Do not include the file extension with the name.
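The export and import behavior described above can be approximated with the standard library. In this sketch, `resolve`, `save`, and `load` are hypothetical helpers, not the extension's methods, and the signature step is left out for brevity.

```python
import lzma
import pickle
import tempfile
from pathlib import Path


def resolve(filename, default_dir):
    """Absolute names are used as given; others go under default_dir."""
    path = Path(filename)
    if not path.is_absolute():
        path = Path(default_dir) / path
    # The caller omits the extension, so append it here.
    return path.with_name(path.name + ".xz")


def save(entries, filename, default_dir, names=None):
    """Write all entries, or only those listed in names, to one archive."""
    selected = {k: v for k, v in entries.items() if names is None or k in names}
    target = resolve(filename, default_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    with lzma.open(target, "wb", preset=9) as file:
        pickle.dump(selected, file)
    return target


def load(filename, default_dir, names=None):
    """Read an archive back, optionally keeping only the listed names."""
    with lzma.open(resolve(filename, default_dir), "rb") as file:
        entries = pickle.load(file)
    return {k: v for k, v in entries.items() if names is None or k in names}


entries = {"watchlist": ["NVDA", "AMD"], "note": "semis"}
with tempfile.TemporaryDirectory() as tmp:
    written = save(entries, "nvda", default_dir=tmp)
    everything = load("nvda", default_dir=tmp)
    only_watchlist = load("nvda", default_dir=tmp, names=["watchlist"])
```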

piiq · 2024-06-19T07:21:42Z

Hey

First of all this might be the best feature PR description in the history of this repo. Thank you for that.

After reading this description I would like to clarify a few things:

1. The examples here show how you add data to the store of an OBBject that you got in response after using a command. When restoring old stores, where do I get the OBBject from?
2. What is the core differentiator or value proposition of using the store versus aggregating .to_json() of the results into a dictionary and then saving it as a JSON file, or saving the results into separate sheets of an Excel workbook?

deeleeramone · 2024-06-19T07:56:29Z

1. The examples here show how you add data to the store of an obbject that you got in response after using a command. When restoring old stores, where do i get the obbject from?

It can come from memory or file. Only files that have been exported can be loaded back in. When an OBBject is stored, the entire class is pickled. It is restored by validating against the original signature and then calling OBBject.model_validate(restored_obbject).

2. What is the core differentiator or value proposition of using the store versus aggregating  .to_json() of the results into a dictionary and then saving it as a json file or saving the results into separate sheets of an excel workbook?

A major difference from aggregating .to_json() output is that this uses bytes, not strings, as the buffer/IO. Additionally, all of the logic is already applied, so one-liners are all that is required to dump/load collections.

The ability to 'bundle' various objects together as a single export, and maintain the state of an OBBject - with chart and methods etc - is another difference. LLMs could be fed context through curated stores, which can support function calling.

It is not an equivalent to saving the results into an excel workbook, but with Python in Excel, you could unpack the compressed store and access all the original Python objects.

Additionally, schemas for non-OBBject objects will be generated, with a map of {field:type} and dimensions.

The essence is to be a gateway between deployed applications and the Platform.
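To illustrate the bytes-versus-text point with the standard library only (this does not use the extension): JSON has no native date type, so a record like the ones in the results above needs custom handling before it can be dumped, while a pickle round trip restores the original Python types.

```python
import datetime
import json
import pickle

record = {"date": datetime.date(2023, 1, 3), "open": 14.85099983215332}

# JSON is text and cannot encode a date without a custom encoder.
try:
    json.dumps(record)
    json_ok = True
except TypeError:
    json_ok = False

# Pickle works on bytes and restores the original Python objects.
restored = pickle.loads(pickle.dumps(record))
```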

piiq · 2024-06-19T18:17:45Z

It can come from memory or file. Only files that have been exported can be loaded back in. When an OBBject is stored, the entire class is pickled. It is restored by validating against the original signature and then calling OBBject.model_validate(restored_obbject).

Can you show a command sequence, starting with launching Python and ending with loading a store?

I understood your rationale for 2. In a nutshell, it's storing binary data vs. text.

deeleeramone · 2024-06-19T21:03:58Z

Can you show a command sequence, starting with launching Python and ending with loading a store?

Yes, in the screenshot below, I have 2 environments. One is my regular OpenBB dev environment, the other is a brand new one with only openbb-core, openbb-fmp, openbb-store installed as packages. The environment on the right does not have the provider interface or any routers, it is just the bare packages.

On the left, I have assembled the three financial statements as a single archive, and then exported it to my OpenBBUserData folder.

On the right, I have loaded the file using the Store class directly - which also makes it operate as a local variable instead of global - and then unpacked the balance sheet item and applied the to_df() method.

from openbb import obb

balance_data = obb.equity.fundamental.balance("NVDA", provider="fmp", period="quarter")

# Assign it to a variable to save keystrokes.
store = balance_data.store

store.add_store(data=balance_data, name="balance", description="NVDA Quarterly Balance Sheet Statements")
cash_data = obb.equity.fundamental.cash("NVDA", provider="fmp", period="quarter")
store.add_store(data=cash_data, name="cash", description="NVDA Quarterly Cash Flow Statements")
income_data = obb.equity.fundamental.income("NVDA", provider="fmp", period="quarter")
store.add_store(data=income_data, name="income", description="NVDA Quarterly Income Statements")
store.save_store_to_file("nvda_financials")

Then on the importing side:

from openbb_store.store import Store

store = Store()
# Use the full path to the file in standalone mode.
store.load_store_from_file("/Users/danglewood/NewOpenBBUserData/stores/nvda_financials")
balance_data = store.get_store("balance", element='OBBject')

piiq · 2024-06-20T11:42:33Z

Then on the importing side:

This is very helpful, thank you.

Please consider allowing creation of a store without the need to pre-initialize an empty instance. Like using either store = Store(file="my/file/path") or a classmethod like store = Store.from_file(path="my/file/path")
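The two options suggested here could look roughly like this. `MiniStore` is a hypothetical stand-in written for illustration. It is not the extension's Store class, and the parameter names `file` and `path` are taken from the suggestion rather than from the actual API.

```python
import lzma
import pickle
import tempfile
from pathlib import Path


class MiniStore:
    """Hypothetical stand-in for Store; not the extension's class."""

    def __init__(self, file=None):
        self.entries = {}
        # Option 1: populate the instance directly from a file path.
        if file is not None:
            self.load(file)

    @classmethod
    def from_file(cls, path):
        """Option 2: alternative constructor returning a loaded instance."""
        return cls(file=path)

    def load(self, path):
        with lzma.open(path, "rb") as handle:
            self.entries.update(pickle.load(handle))

    def save(self, path):
        with lzma.open(path, "wb") as handle:
            pickle.dump(self.entries, handle)


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "demo.xz"
    original = MiniStore()
    original.entries["watchlist"] = ["NVDA", "AMD"]
    original.save(archive)

    via_init = MiniStore(file=archive)
    via_classmethod = MiniStore.from_file(archive)
```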

deeleeramone · 2024-06-21T05:20:39Z

Please consider allowing creation of a store without the need to pre-initialize an empty instance. Like using either store = Store(file="my/file/path") or a classmethod like store = Store.from_file(path="my/file/path")

Like so?

hjoaquim

I'm not sure if this should live under the Platform repo (vs another repo like openbb-forecast vs personal repo); but if we agree it's the right place, everything here looks good to me.

IgorWounds · 2024-07-31T10:52:53Z

I think that this PR should be in its own separate repo, as its own package. CC: @piiq

deeleeramone · 2024-08-01T01:46:19Z

I think that this PR should be in its own separate repo, as its own package. CC: @piiq

I have a couple more design considerations before I'd call it "ready". I can move it to another repo when I cross that milestone, will leave open for reference in the meantime.

openbb-store

4cbd2f9

deeleeramone added the enhancement, platform, and v4 labels Jun 18, 2024

deeleeramone requested a review from hjoaquim June 18, 2024 20:47

Merge branch 'develop' into feature/openbb-store

24328a4

deeleeramone marked this pull request as draft June 18, 2024 20:47

deeleeramone added 3 commits June 18, 2024 13:55

codespell

798dfc5

Merge branch 'feature/openbb-store' of https://github.com/OpenBB-fina…

d6bc545

…nce/OpenBBTerminal into feature/openbb-store

handle schema variation

d5da117

deeleeramone requested review from piiq, IgorWounds, montezdesousa and jmaslek June 19, 2024 06:50

no to_dict for df on output

99b1af3

Merge branch 'develop' into feature/openbb-store

e2829b8

Merge branch 'develop' of https://github.com/OpenBB-finance/OpenBBTer…

42eb7ad

…minal into feature/openbb-store

deeleeramone and others added 7 commits June 21, 2024 09:10

Merge branch 'develop' of https://github.com/OpenBB-finance/OpenBBTer…

e54706f

…minal into feature/openbb-store

initialize class from file

34453d9

Merge branch 'develop' into feature/openbb-store

4cb998c

Merge branch 'develop' into feature/openbb-store

3d6c0d8

Merge branch 'develop' into feature/openbb-store

cd3d6e1

Merge branch 'develop' into feature/openbb-store

5854a00

Merge branch 'develop' into feature/openbb-store

de85d38

deeleeramone added 14 commits July 4, 2024 09:57

Merge branch 'develop' into feature/openbb-store

f9779bc

Merge branch 'develop' into feature/openbb-store

10c6205

update lock

0efd5f2

mypy

ebba53b

missing docstring

9c5bcf1

prevent cannot pickle '_contextvars.Context' object

4c508d3

pylint

3196665

black

841ab10

pylint

27fca8e

doctstring

9781be8

compress/decompress as staticmethods

686e0b5

Merge branch 'develop' into feature/openbb-store

e7448d6

Merge branch 'develop' into feature/openbb-store

c23e37a

Merge branch 'develop' into feature/openbb-store

53a78fd

hjoaquim approved these changes Jul 31, 2024

deeleeramone added 13 commits July 31, 2024 18:46

Merge branch 'develop' into feature/openbb-store

242e2e4

Merge branch 'develop' of https://github.com/OpenBB-finance/OpenBB in…

0e5b507

…to feature/openbb-store

Merge branch 'develop' of https://github.com/OpenBB-finance/OpenBB in…

39c3c5f

…to feature/openbb-store

add excel file support and default stores

53d8233

lint

86c96a2

linting

96e5270

black

d0a1eef

more linting

14af87b

grammar police

7fa0a69

more linting

3084dd3

Merge branch 'develop' into feature/openbb-store

f892085

Merge branch 'develop' into feature/openbb-store

0e80ac5

Merge branch 'develop' into feature/openbb-store

dfcbb78
