Skip to content

Conversation

@Trott
Copy link
Member

Improve accuracy of the error message when a hex string of an incorrect
length is send to .write() or .fill().

Fixes: #3770

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

@TrottTrott added buffer Issues and PRs related to the buffer subsystem. semver-major PRs that contain breaking changes and should be released in the next major version. labels Mar 23, 2017
@TrottTrott added this to the 8.0.0 milestone Mar 23, 2017
@nodejs-github-botnodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. labels Mar 23, 2017
@Trott
Copy link
MemberAuthor

Rather than attempting to adjust the behavior of our code to match the error message (as in #3773) this takes the approach of adjusting the error message to more accurately report the issue that is found. (Of course, we can always implement more complete and robust error checking like in #3773 at a later date if deemed desirable and performant.)

@Trott
Copy link
MemberAuthor

Copy link
Member

@targostargos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a RangeError is appropriate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment should be updated

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can be a bit more explicit about it? Something like String length must be a multiple of 2 if encoding is "hex"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps hex string length must be a multiple of 2

Copy link
MemberAuthor

@TrottTrottMar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think this check isn't really worthwhile and the right thing to do is remove it. This error is still misleading in that if you get it, one would reasonably expect there's more validation going on for hex strings than just a length check. One would reasonably expect that we check for valid hex chars. But we don't. We happily allow out-of-range characters just as long as the string length is divisible by two.

The upshot is puzzling behavior like this:

> Buffer.from('xx', 'hex')<Buffer > > Buffer.from('x', 'hex')TypeError: Invalid hex string at Buffer.write (buffer.js:769:21) at fromString (buffer.js:213:18) at Function.Buffer.from (buffer.js:105:12) at repl:1:8 at ContextifyScript.Script.runInThisContext (vm.js:23:33) at REPLServer.defaultEval (repl.js:339:29) at bound (domain.js:280:14) at REPLServer.runBound [as eval] (domain.js:293:12) at REPLServer.onLine (repl.js:536:10) at emitOne (events.js:101:20) > Buffer.from('abxx', 'hex')<Buffer ab> > Buffer.from('abx', 'hex')TypeError: Invalid hex string at Buffer.write (buffer.js:769:21) at fromString (buffer.js:213:18) at Function.Buffer.from (buffer.js:105:12) at repl:1:8 at ContextifyScript.Script.runInThisContext (vm.js:23:33) at REPLServer.defaultEval (repl.js:339:29) at bound (domain.js:280:14) at REPLServer.runBound [as eval] (domain.js:293:12) at REPLServer.onLine (repl.js:536:10) at emitOne (events.js:101:20) > 

It would be a whole lot less puzzling if this happened:

> Buffer.from('xx', 'hex')<Buffer > > Buffer.from('x', 'hex')<Buffer > > Buffer.from('abxx', 'hex')<Buffer ab> > Buffer.from('abx', 'hex')<Buffer ab> >

Principle Of Least Astonishment to me says we either accept anything and do the best we can or else we do reasonably rigorous error checking.

@Trott
Copy link
MemberAuthor

Maybe a RangeError is appropriate?

If I got a RangeError, I would expect it to be something like the string contained characters that were invalid hex digits (/[^a-fA-F0-9]/). This error we're talking about here seems more like "invalid formatting" than "out of acceptable range" to me...

@Trott
Copy link
MemberAuthor

Reprinting an above comment down here for more visibility:

I'm starting to think this check isn't really worthwhile and the right thing to do is remove it. This error is still misleading in that if you get it, one would reasonably expect there's more validation going on for hex strings than just a length check. One would reasonably expect that we check for valid hex chars. But we don't. We happily allow out-of-range characters just as long as the string length is divisible by two.

The upshot is puzzling behavior like this:

> Buffer.from('xx', 'hex')<Buffer > > Buffer.from('x', 'hex')TypeError: Invalid hex string at Buffer.write (buffer.js:769:21) at fromString (buffer.js:213:18) at Function.Buffer.from (buffer.js:105:12) at repl:1:8 at ContextifyScript.Script.runInThisContext (vm.js:23:33) at REPLServer.defaultEval (repl.js:339:29) at bound (domain.js:280:14) at REPLServer.runBound [as eval] (domain.js:293:12) at REPLServer.onLine (repl.js:536:10) at emitOne (events.js:101:20) > Buffer.from('abxx', 'hex')<Buffer ab> > Buffer.from('abx', 'hex')TypeError: Invalid hex string at Buffer.write (buffer.js:769:21) at fromString (buffer.js:213:18) at Function.Buffer.from (buffer.js:105:12) at repl:1:8 at ContextifyScript.Script.runInThisContext (vm.js:23:33) at REPLServer.defaultEval (repl.js:339:29) at bound (domain.js:280:14) at REPLServer.runBound [as eval] (domain.js:293:12) at REPLServer.onLine (repl.js:536:10) at emitOne (events.js:101:20) > 

It would be a whole lot less puzzling if this happened:

> Buffer.from('xx', 'hex')<Buffer > > Buffer.from('x', 'hex')<Buffer > > Buffer.from('abxx', 'hex')<Buffer ab> > Buffer.from('abx', 'hex')<Buffer ab> >

Principle Of Least Astonishment to me says we either accept anything and do the best we can or else we do reasonably rigorous error checking.

@addaleax
Copy link
Member

accept anything and do the best we can

That’s mostly what we do for base64, so doing it for hex too sounds good to me

@Trott
Copy link
MemberAuthor

That’s mostly what we do for base64, so doing it for hex too sounds good to me

Done! PTAL

@TrottTrott changed the title buffer: better error for malformated hex stringbuffer: remove error for malformated hex stringMar 25, 2017
@TrottTrott changed the title buffer: remove error for malformated hex stringbuffer: remove error for malformatted hex stringMar 25, 2017
@addaleax
Copy link
Member

@targos@cjihrig You probably want to re-review too now that this PR does something quite different :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test where the resulting length is >0 ?
Like Buffer.from('abx', 'hex')

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@targos Good idea. Added!

Remove error message when a hex string of an incorrect length is sent to .write() or .fill(). Fixes: nodejs#3770
@jgeewax
Copy link

(Just chiming in as this relates to #4877)

It's clear that there's precedent here with base64 (it clearly just swallows any invalid characters):

>Buffer.from('not !**# base64 at all_... !!!!','base64').toString('base64')'notbase64atall//'

Would it make sense to also add "strict mode" where invalid values are rejected?

In other words, I'd guess there is a not-insignificant group of folks out there who would prefer to be told when garbage goes into creating a new Buffer (since they're saying what the format should be). A strict mode wouldn't hurt those of us who just want the buffer to do "just do the best it can", and is super helpful to those of us that want to be notified when we accidentally send invalid values into the Buffer.

A few options that seem reasonable:

  • Flags like RegExp: Buffer.from(input, 'hex', 's')
  • Boolean for strict-mode: Buffer.from(input, 'base64', true)
  • A separate encoding type: Buffer.from(input, 'hex-strict')

Not sure if it matters also, but short-cut validation methods that check whether the Buffer.from(input, 'base64').toString('base64') == input could be useful (this might be out of scope of the Buffer class though...)

@Trott
Copy link
MemberAuthor

Trott commented Mar 26, 2017

@jgeewax I definitely see that sort of error-checking safety as useful behavior and while I wish it was what Node.js did, it does seem like there's no reason that validating input like that can't be implemented as a userland module. EDIT: And if a userland module can show a way to do it in a performant way, that might even be a path into Node.js core. My sense right now is that most of the objections/hesitance about expanding the error checking is the performance hit incurred.

@addaleax
Copy link
Member

@jgeewax If we were designing Node from scratch, we’d probably implement one of your suggestions (which all make sense from an API perspective), but I think they all come with some kind of negative performance impact that we would want to avoid now.

@Trott
Copy link
MemberAuthor

@Trott
Copy link
MemberAuthor

Landed in 682573c

@TrottTrott closed this Mar 28, 2017
Trott added a commit to Trott/io.js that referenced this pull request Mar 28, 2017
Remove error message when a hex string of an incorrect length is sent to .write() or .fill(). PR-URL: nodejs#12012Fixes: nodejs#3770 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Michaël Zasso <[email protected]> Reviewed-By: Colin Ihrig <[email protected]> Reviewed-By: Luigi Pinca <[email protected]> Reviewed-By: Sakthipriyan Vairamani <[email protected]>
@jasnelljasnell mentioned this pull request Apr 4, 2017
leonardodino added a commit to leonardodino/basic-crypto that referenced this pull request Oct 31, 2017
buffer behaviour change caused by: nodejs/node#12012 landed in: nodejs/node@682573c11
@TrottTrott deleted the hex-fix branch January 13, 2022 22:45
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bufferIssues and PRs related to the buffer subsystem.c++Issues and PRs that require attention from people who are familiar with C++.semver-majorPRs that contain breaking changes and should be released in the next major version.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

buffer: inconsistent string validation

10 participants

@Trott@addaleax@jgeewax@jasnell@thefourtheye@lpinca@targos@cjihrig@joyeecheung@nodejs-github-bot