Add the ability to have metastore partitioning column as a string, not a date #424

Open
yruslan opened this issue Jun 7, 2024 · 0 comments
Labels
DS enhancement New feature or request Pramen-Scala

yruslan commented Jun 7, 2024

Background

Currently, the information date format and type are ignored when the metastore persistence format is 'delta'.

For example, here the date format will be ignored:

pramen.metastore.tables = [
    {
      name = "table1"
      description = "Table 1 description"
      format = "delta"
      path = ${base.path}/table1

      information.date.column = "info_date"
      information.date.format = "yyyyMM"
    }
]

while here it will work:

pramen.metastore.tables = [
    {
      name = "table1"
      description = "Table 1 description"
      format = "parquet"
      path = ${base.path}/table1

      information.date.column = "info_date"
      information.date.format = "yyyyMM"
    }
]

(the only difference is the storage format)

Feature

Add the ability to have metastore partitioning column as a string, not a date.

Example

--

Proposed Solution

Pramen writes Parquet partitions directly: it constructs the partition path itself. For Delta, we rely on the type system. So I guess we need to treat any date format other than 'yyyy-MM-dd' as having string type.
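A minimal sketch of the rule proposed above, not the actual Pramen implementation. All names here (`PartitionColumnSketch`, `partitionColumnType`, `partitionValue`, `parquetPartitionPath`) are hypothetical, and the `column=value` path layout is an assumption about how the Parquet partition path is built.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper; none of these names come from the Pramen code base.
object PartitionColumnSketch {
  sealed trait PartitionColumnType
  case object DateColumn extends PartitionColumnType
  case object StringColumn extends PartitionColumnType

  // The only format that would keep the partition column typed as a date.
  val DefaultDateFormat = "yyyy-MM-dd"

  // Proposed rule: any format other than 'yyyy-MM-dd' means a string column.
  def partitionColumnType(infoDateFormat: String): PartitionColumnType =
    if (infoDateFormat == DefaultDateFormat) DateColumn else StringColumn

  // The information date rendered with the configured format.
  def partitionValue(infoDate: LocalDate, infoDateFormat: String): String =
    infoDate.format(DateTimeFormatter.ofPattern(infoDateFormat))

  // Assumed Parquet-style layout: the formatted value is embedded in the path.
  def parquetPartitionPath(basePath: String,
                           column: String,
                           infoDate: LocalDate,
                           infoDateFormat: String): String =
    s"$basePath/$column=${partitionValue(infoDate, infoDateFormat)}"
}
```

With `information.date.format = "yyyyMM"`, `partitionColumnType("yyyyMM")` yields `StringColumn`, and `partitionValue(LocalDate.of(2024, 6, 7), "yyyyMM")` yields `"202406"`. For Delta, that string value would be what gets written into the `info_date` column instead of a date.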

Code to look at:

@yruslan yruslan added enhancement New feature or request DS Pramen-Scala labels Jun 7, 2024