Skip to content

Conversation

@Code-Hex
Copy link
Contributor

@Code-HexCode-Hex commented May 26, 2020

Description

When the time information is passed in byteslice, it takes a long time to parse it. The order in which the current parsing is done is

  1. cast dest[i] to byteslice
  2. cast byteslice to string
  3. parseDateTime

And since the format for returning the time information to the client has been decided, I think it's obviously faster to do the parsing ownself right away.

スクリーンショット 2020-05-26 11 47 23

The benchmark results

$ go test -benchmem . -bench "(DateTime|Cast)$" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkParseDateTime-4 5039470 250 ns/op 0 B/op 0 allocs/op BenchmarkParseByteDateTime-4 11188809 99.2 ns/op 0 B/op 0 allocs/op BenchmarkParseByteDateTimeStringCast-4 3342747 356 ns/op 32 B/op 1 allocs/op 
Code
funcdeprecatedParseDateTime(strstring, loc*time.Location) (t time.Time, errerror){constbase="0000-00-00 00:00:00.000000"switchlen(str){case10, 19, 21, 22, 23, 24, 25, 26: // up to "YYYY-MM-DD HH:MM:SS.MMMMMM"ifstr==base[:len(str)]{return } ifloc==time.UTC{returntime.Parse(timeFormat[:len(str)], str) } returntime.ParseInLocation(timeFormat[:len(str)], str, loc) default: err=fmt.Errorf("invalid time string: %s", str) return } } funcBenchmarkParseDateTime(b*testing.B){str:="2020-05-13 21:30:45"loc:=time.FixedZone("test", 8*60*60) b.ResetTimer() fori:=0; i<b.N; i++{_, _=deprecatedParseDateTime(str, loc) } } funcBenchmarkParseByteDateTime(b*testing.B){bStr:= []byte("2020-05-25 23:22:01.159491") loc:=time.FixedZone("test", 8*60*60) b.ResetTimer() fori:=0; i<b.N; i++{_, _=parseDateTime(bStr, loc) } } funcBenchmarkParseByteDateTimeStringCast(b*testing.B){bStr:= []byte("2020-05-25 23:22:01.159491") loc:=time.FixedZone("test", 8*60*60) b.ResetTimer() fori:=0; i<b.N; i++{_, _=deprecatedParseDateTime(string(bStr), loc) } }

MySQL datetime format
https://dev.mysql.com/doc/refman/8.0/en/datetime.html

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

The benchmark results $ go test -benchmem . -bench "^BenchmarkParseByte" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkParseByteDateTime-4 12023173 104 ns/op 0 B/op 0 allocs/op BenchmarkParseByteDateTimeStringCast-4 3394239 355 ns/op 32 B/op 1 allocs/op
@methane
Copy link
Member

Can we just change the argument type of parseDateTime, instead of adding parseByteDateTime?

@Code-Hex
Copy link
ContributorAuthor

Code-Hex commented May 26, 2020

@methane Honestly, it bothers me.
I added a new one because there is already a case for handling strings. (I think there's a cost to casting to string to byte slice)

mysql/nulltime.go

Lines 34 to 38 in 8c3a2d9

casestring:
nt.Time, err=parseDateTime(v, time.UTC)
nt.Valid= (err==nil)
return
}

@methane
Copy link
Member

@Code-Hex It is deprecated function (See #960).
I think type conversion (allocation + copy) in the function is reasonable than two same implementation of parseDateTime.

@Code-Hex
Copy link
ContributorAuthor

@methane Thanks for your reply.
Sounds good to me. I fix it.

@Code-HexCode-Hex requested a review from methaneMay 27, 2020 02:43
utils_test.go Outdated
loc:=time.FixedZone("test", 8*60*60)
b.ResetTimer()
fori:=0; i<b.N; i++{
_, _=deprecatedParseDateTime(string(bStr), loc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please post the result of this benchmark and remove deprecatedParseDateTime.
We do not keep old implementations only for benchmark.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed 2c6aa27

Code-Hexand others added 4 commits May 28, 2020 16:42
@Code-HexCode-Hex requested a review from methaneMay 28, 2020 08:40
@chanxuehong
Copy link
Contributor

chanxuehong commented May 30, 2020

nice job!!!

I just thought that the parseDateTime can be optimized in the past few days, so I mentioned a PR #1117, and then I saw this implementation, it's interesting.

Here I have a few questions:

  1. Is it better to use bytes.Equal here???

    ifstring(b) ==base[:len(b)]{

  2. Is it really necessary to check that year(month, day) is equal to 0(in fact they will not be less than 0 here), we have not checked here if month is greater than 12, nor day is greater than 31

ifyear<=0{

ifm<=0{

ifday<=0{

@shogo82148
Copy link
Contributor

I have same question with chanxuehong about the range of year(month, day).
Why month 0 is treated as January?
In the current implementation, parseDateTime returns "month out of range" error with month 0.

https://play.golang.org/p/HgQkcLCOGMU

package main import ( "fmt""time" ) funcmain(){fmt.Println(time.Parse("2006-01-02 15:04:05.999999", "0000-00-00 00:00:00.000000")) fmt.Println(time.Parse("2006-01-02 15:04:05.999999", "9999-99-99 99:99:99.999999")) }
0001-01-01 00:00:00 +0000 UTC parsing time "0000-00-00 00:00:00.000000": month out of range 0001-01-01 00:00:00 +0000 UTC parsing time "9999-99-99 99:99:99.999999": month out of range 

I think the range checks are not needed, because MySQL returns correct datetime and we don't need to take care of incorrect datetime.
In addition, it is a little complex to calculate the correct range of the day. It varies depending on month and year (there are leap years!).

https://play.golang.org/p/0Z4QX-QtOxn

@Code-Hex
Copy link
ContributorAuthor

Code-Hex commented May 31, 2020

@chanxuehong@shogo82148 thanks for the mention to me.

  1. In my understanding, the maintainer wanted to take readability (I had implemented at utils: parse using byteslice in parseDateTime #1113 (comment)).

When I actually measured it, it wasn't much of a difference, so I thought it was good.

Code & Results
constbase="0000-00-00 00:00:00.000000"varbbase= []byte(base) funcBenchmarkByteStringEq(b*testing.B){fori:=0; i<b.N; i++{b:= []byte("0000-00-00 00:00:00.000000") _=string(b) ==base[:len(b)] } } funcBenchmarkByteEq(b*testing.B){fori:=0; i<b.N; i++{b:= []byte("0000-00-00 00:00:00.000000") _=bytes.Equal(b, bbase[:len(b)]) } }

Result (two times)

$ go test . -bench "BenchmarkByte" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkByteStringEq-4 171111117 6.86 ns/op BenchmarkByteEq-4 177546639 6.86 ns/op PASS ok github.com/go-sql-driver/mysql 3.837s $ go test . -bench "BenchmarkByte" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkByteStringEq-4 174836755 6.76 ns/op BenchmarkByteEq-4 168449960 6.94 ns/op PASS ok github.com/go-sql-driver/mysql 3.823s 
  1. I've been thinking a lot.

Assumptions, I basically believe that we can trust the values that MySQL passes. So I don't think it's really necessary to do a range check here.

The reason for checking here is so that time.Time can be treated as zero value even if the parsed result is less than or equal to zero value. I thought that if we could treat it as a zero value, it would be possible to handle it on the user side at worst.

https://play.golang.org/p/4iQJgoOGYIQ

But in this case, time.Parse can parse and not return an error.

https://play.golang.org/p/wRD2ylQiF_9


Anyway, I think it's okay to leave out year, day, month range confirmation logic. I'll fix it if the maintainer is agree.

@shogo82148
Copy link
Contributor

shogo82148 commented May 31, 2020

bytes.Equal(a, b []byte) is the same as string(a) == string(b). (I didn't know that!)

https://github.com/golang/go/blob/f1f8f9af9a55d73dfc6603a93bee0559fdc9024d/src/bytes/bytes.go#L18-L21

As the comment says, the Go compiler optimize this comparison.
So no need to use bytes.Equal for the performance.

time.Date(1, 1, 1, 0, 0, 0, 0, loc) is not the zero value if the loc is not time.UTC.

https://play.golang.org/p/T9-R0Bdpfzr

So it is not safe that users treat the zero value.


Anyway, my question doesn't block merging.

@shogo82148
Copy link
Contributor

Sorry...
It's my miss operation.

@julienschmidtjulienschmidt changed the title fixed the way of parsing datetime when byte slice stringutils: parse using byteslice in parseDateTimeMay 31, 2020
@julienschmidtjulienschmidt merged commit 12508c8 into go-sql-driver:masterMay 31, 2020
@Code-HexCode-Hex deleted the fix/parse-byte-time branch May 31, 2020 23:44
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
* fixed the way of parsing datetime when byte slice string The benchmark results $ go test -benchmem . -bench "^BenchmarkParseByte" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkParseByteDateTime-4 12023173 104 ns/op 0 B/op 0 allocs/op BenchmarkParseByteDateTimeStringCast-4 3394239 355 ns/op 32 B/op 1 allocs/op * added line to AUTHORS file * fixed error handling * fixed nanosec digits * added more tests for error * renamed parseByteDateTime to parseDateTime * reverted base null time * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * removed deprecatedParseDateTime from test Co-authored-by: Inada Naoki <[email protected]>
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
* fixed the way of parsing datetime when byte slice string The benchmark results $ go test -benchmem . -bench "^BenchmarkParseByte" goos: darwin goarch: amd64 pkg: github.com/go-sql-driver/mysql BenchmarkParseByteDateTime-4 12023173 104 ns/op 0 B/op 0 allocs/op BenchmarkParseByteDateTimeStringCast-4 3394239 355 ns/op 32 B/op 1 allocs/op * added line to AUTHORS file * fixed error handling * fixed nanosec digits * added more tests for error * renamed parseByteDateTime to parseDateTime * reverted base null time * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * Update utils.go Co-authored-by: Inada Naoki <[email protected]> * removed deprecatedParseDateTime from test Co-authored-by: Inada Naoki <[email protected]>
@methane
Copy link
Member

I think the range checks are not needed, because MySQL returns correct datetime and we don't need to take care of incorrect datetime.
In addition, it is a little complex to calculate the correct range of the day. It varies depending on month and year (there are leap years!).

You are right. These checks caused regression: #1252.

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

@Code-Hex@methane@chanxuehong@shogo82148@julienschmidt