Skip to content

Conversation

@buger
Copy link
Owner

This is ongoing PR but i wanted to show what i'm working on right now.

There is 2 major changes here:

  • Support for querying multiple keys at once
  • Struct unmarshmaling, same way as encoding/json works (use feature above)

I added new API function KeyOffset which returns offsets in base data structure for all keys at once, so it can do less reads.

Here is example taken from benchmark:

r := smallFixture offsets := jsonparser.KeyOffsets(r, []string{"uuid"}, []string{"tz"}, []string{"ua"}, []string{"st"}, ) jsonparser.Get(r[offsets[0]:]) jsonparser.GetInt(r[offsets[1]:]) jsonparser.Get(r[offsets[2]:]) jsonparser.GetInt(r[offsets[3]:]) 

As you can see from UX perspective it looks way worse, but it can give significant speedup, see benchmarks below. As future improvement i wanted implement iterator like access, similar to ArrayEach which reduce number of reads even more.

Based on feature above, i implemented basic version of Unmarshaler, you can check the code and bechmarks (looks not bad, considering that it does not use code generation). As unique feature i plan add support for nested keys, so you can specify them like:

struct Person{githubFollowers int `json:"person.github.followers_count"` ... } 

Here is initial benchmarks:

BenchmarkJsonParserLarge 100000 75014 ns/op 32 B/op 2 allocs/op BenchmarkJsonParserLargeOffsets 100000 81327 ns/op 528 B/op 33 allocs/op BenchmarkJsonParserMedium 500000 15577 ns/op 48 B/op 3 allocs/op BenchmarkJsonParserMediumOffsets 1000000 7757 ns/op 80 B/op 4 allocs/op BenchmarkJsonParserSmall 5000000 1289 ns/op 0 B/op 0 allocs/op BenchmarkJsonParserSmallOffsets 10000000 1065 ns/op 32 B/op 1 allocs/op BenchmarkJsonParserSmallStruct 2000000 3280 ns/op 384 B/op 11 allocs/op 

KeyOffsets can be optimized much more in my view.

@buger
Copy link
OwnerAuthor

@daboyuka
Copy link
Contributor

This looks really cool. Thanks for keeping the offset-computing code separate so as to not slow down the simpler searchKeys case (this is nice for me since I'm not sure the multi-key search will be needed for my use case).

@daboyuka
Copy link
Contributor

What if you made a wrapper around the core functions for a more user-friendly experience? E.g.:

lookup := NewMultiKeyLookup(data) refs := lookup.MultiGet( []string{"a"}, []string{"b"}, []string{"c"}, ) val1 := lookup.Get(ref[0]) val2 := lookup.GetInt(ref[1]) val3 := lookup.Get(ref[2]) 

The refs returned by MultiGet would probably just be the offset to the value, but could be typedefed to be something meaningful (valueref or something).

The idea is to hide the need for slicing the data and calling Get() for each offset lookup inside the API.

Just thinking out loud, not sure this is actually a good idea :).

@buger
Copy link
OwnerAuthor

Yeah, this could be a good idea, and we can hide caching inside.

@buger
Copy link
OwnerAuthor

Well actually syntax above is not supported, as Go do not support variable list of returned arguments

@daboyuka
Copy link
Contributor

Doh, you're right, I guess I meant refs := and then refs[0], etc. Edited.

@bugerbuger mentioned this pull request Apr 17, 2016
@Allendar
Copy link

Allendar commented Jun 10, 2016

I want to test this soon with some big data-sets and will supply the results here. Just by mere conversion from ffjson for inspections on JSON on multiple of my API's I'm seeing very reliable 9,21 (1k entries) to 9,38 (100k entries) times more speed and way lower footprint. Can't wait to try this out next! Thanks so much for this amazing contribution!

Update: Maybe there's also focus points here. These are the hotspots on my pprof:

(pprof) top5 1060ms of 1260ms total (84.13%) Showing top 5 nodes out of 48 (cum >= 110ms) flat flat% sum% cum cum% 470ms 37.30% 37.30% 470ms 37.30% github.com/buger/jsonparser.stringEnd 210ms 16.67% 53.97% 210ms 16.67% runtime.usleep 150ms 11.90% 65.87% 460ms 36.51% github.com/buger/jsonparser.blockEnd 140ms 11.11% 76.98% 320ms 25.40% github.com/buger/jsonparser.searchKeys 

@buger
Copy link
OwnerAuthor

@Allendar also check this change that was merged yesterday #45

Also, I'll be frank, this PR is not ready yet and supports only simple structs.

@ryanleary
Copy link

@buger Is work continuing on this effort?

@Villenny
Copy link
Contributor

Doh, you're right, I guess I meant refs := and then refs[0], etc. Edited.

Doh, you're right, I guess I meant refs := and then refs[0], etc. Edited.

Have you considered some variation on:

type ValueOffset int type ValueOffsets []ValueOffset lookup := jsonparser.ParseKeys(make([]ValueOffset, 3)[:], []string{"a"}, []string{"b"}, []string{"c"}, ) val1 := lookup.Get(0) val2 := lookup.GetInt(1) val3 := lookup.GetUnsafeString(2) 

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

@buger@daboyuka@Allendar@ryanleary@Villenny