Simple query example #141

Open
matthewcummings opened this issue Sep 1, 2020 · 8 comments

Comments

@matthewcummings

Could you add a simple example showing how to run a query? I ended up going with the AWS SDK for now because I couldn't figure out the proper way to set up the query using your library.

@guregu
Owner

guregu commented Sep 8, 2020

I agree we need more examples. The godoc could use some love for sure. Also want to expand the readme.
There are some if you look at the tests, but I wouldn't call them simple: https://github.com/guregu/dynamo/blob/master/query_test.go

@matthewcummings
Author

I saw those, I ended up just using the AWS SDK directly. I couldn't figure out how to query based on a partition key and a range key together.

@guregu
Owner

guregu commented Sep 8, 2020

OK. I'll write an example in case someone stumbles upon this looking for help.
Let's say we have this data in a table called "Events".

| UserID (partition key, number) | Date (sort key, string) | Comment (extra data, not a primary key) |
|---|---|---|
| 1 | 2020-02-10 | hello |
| 1 | 2020-02-11 | howdy |
| 1 | 2020-02-12 | sup |
| 1 | 2020-02-20 | |
| 2 | 2020-01-10 | |
| 2 | 2020-05-01 | |

This library uses the old term for partition key, which is "hash key". The sort key is a "range key".

We can model our data like this:

type Event struct {
	UserID  int
	Date    string
	Comment string
}

Initialize the DB and specify the table (the README.md contains a full example for dynamo.New):

db := dynamo.New( /*...*/ )
table := db.Table("Events")

If we want to grab the first row in this table, we can do this, which will use a GetItem API call.

var event Event
// get UserID=1 and Date=2020-02-10
err := table.Get("UserID", 1).Range("Date", dynamo.Equal, "2020-02-10").One(&event)
fmt.Println(event) // prints Event{UserID: 1, Date: 2020-02-10, Comment: hello}

Let's do a query for all of UserID=1's items between 2020-02-10 and 2020-02-12. This will use a Query API call.

var events []Event
err := table.Get("UserID", 1).Range("Date", dynamo.Between, "2020-02-10", "2020-02-12").All(&events)
fmt.Println(events) // got 3 items for UserID=1: 2020-02-10, 2020-02-11, and 2020-02-12
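
To make the key condition concrete, here is a rough in-memory sketch of what the query above selects. It does not use this library or DynamoDB at all: `queryBetween` and `sampleEvents` are hypothetical helpers invented for illustration. It relies on two facts: DynamoDB's BETWEEN includes both bounds, and ISO-style date strings sort correctly as plain strings.

```go
package main

import "fmt"

// Event mirrors the struct used earlier in this thread.
type Event struct {
	UserID  int
	Date    string
	Comment string
}

// sampleEvents returns the rows of the example "Events" table.
func sampleEvents() []Event {
	return []Event{
		{1, "2020-02-10", "hello"},
		{1, "2020-02-11", "howdy"},
		{1, "2020-02-12", "sup"},
		{1, "2020-02-20", ""},
		{2, "2020-01-10", ""},
		{2, "2020-05-01", ""},
	}
}

// queryBetween is a hypothetical stand-in for a hash key match plus a
// Between condition on the range key. Both bounds are inclusive, and the
// dates are compared as ordinary strings.
func queryBetween(items []Event, userID int, lo, hi string) []Event {
	var out []Event
	for _, e := range items {
		if e.UserID == userID && e.Date >= lo && e.Date <= hi {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	for _, e := range queryBetween(sampleEvents(), 1, "2020-02-10", "2020-02-12") {
		fmt.Println(e.Date, e.Comment)
	}
}
```

The 2020-02-20 row is left out because it sorts after the upper bound, and UserID=2's rows never match the hash key.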

Get all of UserID=2's events:

var events []Event
err := table.Get("UserID", 2).All(&events)
fmt.Println(events) // got 2 items for UserID=2: 2020-01-10, 2020-05-01

You can use Filter to further refine your search. This gets all of UserID=1's events whose comment starts with 'h'.

var events []Event
err := table.Get("UserID", 1).Filter("begins_with('Comment', ?)", "h").All(&events)
fmt.Println(events) // got 2 items for UserID=1: 2020-02-10 (hello), 2020-02-11 (howdy)

You can use anything from the reference here.

That's the basic idea, the rest of the library follows pretty much the same API. For example, Scan has the same filter syntax, etc.
Hope this helps. I'll try and add this to the README/Godoc soon.

@cmawhorter

sometimes datasets can be large and you need to page through each individually.

i'm new to go and might be missing something obvious, but All() doesn't work in these situations.

fwiw this is what i came up with:

	var offset dynamo.PagingKey
	var resultIds []string
	for {
		itr := s.Table.Scan().
			Filter("begins_with($, ?)", "UserId", prefix).
			StartFrom(offset).
			Iter()
		if itr.Err() != nil {
			// handle error
		}
		for {
			var record Record
			more := itr.Next(&record)
			if !more {
				break
			}
			resultIds = append(resultIds, record.PrimaryKey)
		}
		// no more pages to load. could itr.HasMore() exist?
		if itr.LastEvaluatedKey() == nil {
			break
		}
	}

is there something that wraps that up? maybe All() works this way now somehow and manages LastEvaluatedKey and StartFrom internally?

@guregu
Owner

guregu commented Nov 20, 2020

@cmawhorter
All() and Iter() should still work with large datasets.
Internally, All() just calls Iter() and runs Next until it runs out of results. And Iter already automatically uses the LastEvaluatedKey.

I'm not sure what trouble you're facing, but maybe you're timing out? If you check iter.Err() it should tell you. You can change the default timeout or use the WithContext methods and control timeouts with context instead. A RetryTimeout of 0 or context.Background() will never time out.
Iter will call itself over and over until there are results (ref). If you're using a filter that only returns a small amount of results in a huge dataset it's possible that it's timing out while doing this.

LastEvaluatedKey is mostly for paginating results when you can't use a range key to do it. For example if you wanted an API to only return at most 100 items at a time, you could use SearchLimit and LastEvaluatedKey.
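
As a rough illustration of that kind of API pagination, here is a standard-library sketch in which a cursor plays the role of LastEvaluatedKey and a limit plays the role of SearchLimit. The `page` type and `listPage` function are hypothetical and not part of this library; with the library you would pass the key from the previous response to StartFrom instead.

```go
package main

import "fmt"

// page is a hypothetical API response: up to limit items plus a cursor.
// An empty Cursor means there are no more pages.
type page struct {
	Items  []string
	Cursor string
}

// listPage returns at most limit items that sort after cursor.
// items is assumed to be sorted and free of duplicates.
func listPage(items []string, cursor string, limit int) page {
	if limit < 1 {
		limit = 1
	}
	var p page
	for _, it := range items {
		if cursor != "" && it <= cursor {
			continue // already returned on an earlier page
		}
		if len(p.Items) == limit {
			// The page is full and at least one more item exists,
			// so hand back the last returned item as the cursor.
			p.Cursor = p.Items[limit-1]
			return p
		}
		p.Items = append(p.Items, it)
	}
	return p
}

func main() {
	dates := []string{"2020-02-10", "2020-02-11", "2020-02-12", "2020-02-20"}
	cursor := ""
	for {
		p := listPage(dates, cursor, 2)
		fmt.Println(p.Items, "next cursor:", p.Cursor)
		if p.Cursor == "" {
			break
		}
		cursor = p.Cursor
	}
}
```

The caller keeps no state other than the cursor, which is what makes this approach usable behind a stateless HTTP API.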

If you're not getting an error from Iter but it's still not returning all the results then it's a bug and I'd like to fix it.

@guregu
Owner

guregu commented Nov 20, 2020

@cmawhorter
To be more specific: you need to call itr.Err() after you finish looping with itr.Next(), not before it. That should give you an error to work with.
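
A minimal sketch of that ordering, using a hypothetical in-memory iterator rather than this library's Iter. The `iter` type, its `failAt` field, and `collect` are invented for illustration. It follows the same Next/Err convention as bufio.Scanner in the standard library: Next returns false both when the data runs out and when something fails, and Err tells the two cases apart.

```go
package main

import (
	"errors"
	"fmt"
)

// iter is a hypothetical iterator that fails when it reaches index failAt.
// Use failAt: -1 for an iterator that never fails.
type iter struct {
	items  []string
	failAt int
	pos    int
	err    error
}

// Next copies the next item into out. It returns false when the items are
// exhausted or when the simulated request fails.
func (it *iter) Next(out *string) bool {
	if it.err != nil || it.pos >= len(it.items) {
		return false
	}
	if it.pos == it.failAt {
		it.err = errors.New("simulated timeout")
		return false
	}
	*out = it.items[it.pos]
	it.pos++
	return true
}

// Err reports the error, if any, that made Next return false.
func (it *iter) Err() error { return it.err }

// collect drains the iterator first and only then checks Err.
func collect(it *iter) ([]string, error) {
	var out []string
	var s string
	for it.Next(&s) {
		out = append(out, s)
	}
	return out, it.Err()
}

func main() {
	it := &iter{items: []string{"a", "b", "c"}, failAt: 2}
	fmt.Println("before looping:", it.Err()) // nothing has run yet, so there is no error to see
	got, err := collect(it)
	fmt.Println("after looping:", got, err)
}
```

Checking Err before the loop always sees nil here, because no work has been attempted yet. The failure only shows up once Next has returned false.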

@cmawhorter

i mistakenly assumed dynamo was like some other ddb libs and it buffered the entire set when calling All() before returning. but since you're calling next instead of returning results, it can know when to grab the next page when necessary. that makes sense. i just reinvented the wheel above in my snippet. 🤦

@alin-simionoiuDE

@pquerna how would you write a range with an IN filter? using your example above, let's assume you want to return all rows where Comment is hello or howdy
