Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blog post addition: Serverless dlt + dbt project #4658
Blog post addition: Serverless dlt + dbt project #4658
Changes from 12 commits
d630e94
e9b9f50
d0d4103
cdd475f
314c43c
c8fb3d0
a168f8b
ce714ab
baa0339
2e22e81
58fd4cf
ab90849
eb02e8c
6c142f2
858d555
79d1ba4
5a1879b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this section to the top of the article - it lays out clearly what the purpose of this work is, and sets the context for why it's important before diving into the technical implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second Unfortunately in two sentences - consider a rephrase. Either just changing to something like Frustratingly, or saying something like "As I started browsing, I quickly discovered that there are way fewer properties than ads"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this list is incomplete since it only has one element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've rounded this out a bit to give a broader overview into the pipeline and then added additional details later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the diagram reflect it? I assume you're talking about the fact that it has to send a Slack alert, but it's not explicit anywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading further, this turns out to not be the case - the human element is actually manually fixing things in GSheets. It would be worth clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have removed this element and discussed the gsheets element later in greater detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general comment: throughout the article you switch back and forth from "we" to "I". Since this was a solo project, and it's not a tutorial so much as a writeup of your learnings, I'd suggest standardising on "I".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a bit of context of what transformations you're running here. How does using dbt map back to your original goals of deduplicating the property listings? Does it catch most of the issues for you, leaving you to only have to manually resolve the outliers? A screenshot of the DAG from dbt docs serve might help to contextualise everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You say "could make", but it looks like you did make these improvements, right? So I don't think this section is necessary as currently laid out.
If you wanted to talk about the Slack notification setup you created, that could make sense in a different context. E.g. you could talk about dlt's extensibility and use this as an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point I've changed this around to mention the addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's a one-line fix in a config file, it would be nice to either post the line of code here or link to whatever documentation/Stack Overflow post you used to solve your problem, as a pointer to whoever tries to replicate your work and has the same issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is a one line config file issue but this wasn't an issue that was able to be replicated. My fix was to explicitly state the object as GcpServiceAccountCredentials in the init.py file for the source. Honestly speaking I felt it overkill to go into details on this and would detract from the core article but adding provided balance to the positives. I would suggest either it stays as is or we remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection I have briefly referenced this if people were to stumble into the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean you're using dlt to create an authenticated connection (presumably to the warehouse?), or that you're creating an authenticated connection to dlt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the connection is to the warehouse sorry if that comes across a little unclear. The destination parameter specifies the type of database to which the data will be loaded.