Enums, enumerations and database types #1184
Replies: 6 comments 15 replies
-
Currently, we just extend the https://docs.sqlalchemy.org/en/14/changelog/changelog_08.html?highlight=ischema_names After a cursory glance at the actual code, it seems that the |
Beta Was this translation helpful? Give feedback.
-
I don't have much to contribute to this discussion beyond the link to the
In order to have concrete suggestions for refactors, I'd need to basically do the work for #1100. I'd be happy to comment on any concrete suggestions offered by someone else. One very general idea I have is that we could have a single file to hold information about each type in a standard format in |
Beta Was this translation helpful? Give feedback.
-
The #1100 issue this discussion stemmed from (which I didn't mention in top post) is not a quick refactor, since the refactorer (me) needs to get a handle on quite a few things. Our type logic is extensive and has a fair amount of technical debt to work through. I've been doing a slow-trickle of a refactor slash exploration of related type logic. A central goal I have is replace most, if not all, of type strings or type classes being passed around with Enums. A thing I did was have Excerpt: class DatabaseType:
[...]
@property
def ischema_key(self):
"""
Looks up this type's canonical type (if it is an alias) and returns its string id that may
correspond to keys on the SA ischema_names dict.
Note that PostgresType values are already such keys. However, MathesarCustomType values
require adding a qualifier prefix.
"""
canonical_id = self.canonical_id
if isinstance(self, MathesarCustomType):
ischema_key = get_qualified_name(canonical_id)
else:
ischema_key = canonical_id
return ischema_key
class PostgresType(DatabaseType, Enum):
[...]
class MathesarCustomType(DatabaseType, Enum):
[...] A problem right now is that if I start changing strings with Enums I'll have a hell of a time tracking down all the places that were expecting a string, but got an Enum. A possible approach could be to add type annotations to all the relevant code. Should be laborious, but not hard. I'd remove the annotations before merge. |
Beta Was this translation helpful? Give feedback.
-
We had another sync with @mathemancer. We reflected on the fact that it's interesting that we're exposing database type alias information to the frontend. @mathemancer pointed out (correctly, I think) that ideally we wouldn't need to have to do that. We realised that the alias information was requested by frontend team, because we were telling the frontend about database types that are possibly aliases of one another. For example, both We discussed some solutions. Our goal would be to only use and expose types that "reflect as themselves", meaning if you create a column of type X and reflect it, it should reflect as type X. That is currently not the case for a few types. Noteworthy examples:
Above "solutions" would result in a database type set that's much easier to handle, both for the backend and frontend, because we need to be able to tell what type a column will be if we apply some type to it. There's another approach that might be worth thinking about. We could explicitly model the fact that [0] @mathemancer showed me that |
Beta Was this translation helpful? Give feedback.
-
An interesting aspect of our type logic is that custom Mathesar types use qualified identifiers in some places (e.g. |
Beta Was this translation helpful? Give feedback.
-
I've made changes to the proposed class DatabaseType:
@property
def id(self):
# Here we're defining Enum's value attribute to be the database type id.
# However, the meaning of this id is not clear, since often you'll use the ischema_key
# below. I've not yet fully conceptualized this duality.
return self.value
# TODO it would be great to merge id(self) and ischema_key(self). for that we'd need to factor
# out the difference in how PostgresType and MathesarCustomType ids are handled. specifically,
# MathesarCustomType ids require adding a prefix.
@property
def ischema_key(self):
"""
Returns the key corresponding to this type on the SA ischema_names dict.
Note that PostgresType values are already such keys. However, MathesarCustomType values
require adding a qualifier prefix.
"""
id = self.id
if isinstance(self, MathesarCustomType):
return get_qualified_name(id)
else:
return id
def get_sa_class(self, engine):
"""
Returns the SA class corresponding to this type or None if this type is not supported by
provided engine, or if it's ignored (see is_ignored).
"""
if not self.is_ignored:
ischema_names = engine.dialect.ischema_names
return ischema_names.get(self.ischema_key)
def is_available(self, engine):
"""
Returns true if this type is available on provided engine.
"""
return self.get_sa_class(engine) is not None
@property
def is_ignored(self):
"""
We ignore some types. Current rule is that if type X is applied to a column, but upon
reflection that column is of some other type, we ignore type X. This mostly means
ignoring aliases. It also ignores NAME and CHAR, because both are reflected as the SA
String type.
"""
ignored_types = (
PostgresType.TIME,
PostgresType.TIMESTAMP,
PostgresType.DECIMAL,
PostgresType.NAME,
PostgresType.CHAR,
)
return self in ignored_types I'm pretty excited to see where this takes our type code. |
Beta Was this translation helpful? Give feedback.
-
We had a sync call with @mathemancer about this. I summed up our sync in the following DM message:
Topics we might want to expand on:
engine.dialect.ischema_names
and reflection in this context;Beta Was this translation helpful? Give feedback.
All reactions