Skip to content

Conversation

@lemire
Copy link
Member

This is in relation with bug #47457 where we found that under Windows, there seems to be a problem when converting from UTF-8 to UTF-16 a JSON message generated by key presses in the REPL.

Tracing it back, I found evidence that in common cases, each time you hit a key in the Node REPL, a call to dispatchProtocolMessage is made, passing an UTF-16 JSON message. The message is converted to UTF-8 into the local variable raw_message, and then we call parseMessage which converts back raw_message into UTF-16 and calls parseJSON on it.

This PR avoids the conversion from UTF-8 back into UTF-16. This should save some processing that is currently done with each key press.

It is possible that it is a Chesterton's fence and that the complexity is somehow necessary. Nevertheless, if this PR is correct, it would be an interesting optimization for the REPL.

@nodejs-github-botnodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. inspector Issues and PRs related to the V8 inspector protocol needs-ci PRs that need a full CI run. labels Sep 22, 2023
@anonriganonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 22, 2023
@github-actionsgithub-actionsbot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 22, 2023
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@anonrig
Copy link
Member

cc @nodejs/inspector @nodejs/cpp-reviewers

@anonriganonrig added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 25, 2023
@tniessentniessen removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 25, 2023
@tniessen
Copy link
Member

Nit: the commit message does not follow our guidelines. Please change it to begin with an imperative verb after the subsystem prefix.

@eugeneo
Copy link
Contributor

I think at this point we can confidently say there will never be a binary message in inspector protocol 😀 (it is pretty much set in stone at this point).

Why not just delete protocol::StringUtil::parseMessage? Note it is generated so needs to be deleted in the template.

@eugeneo
Copy link
Contributor

PR message seems misleading. Proposed change does not seem to touch any UTF8/UTF16 shenanigans unless I am missing something.

@lemirelemire changed the title inspector: simplifies dispatchProtocolMessageinspector: simplify dispatchProtocolMessageSep 26, 2023
@lemire
Copy link
MemberAuthor

Nit: the commit message does not follow our guidelines. Please change it to begin with an imperative verb after the subsystem prefix.

I changed the commit and the PR name.

@lemire
Copy link
MemberAuthor

@eugeneo

Proposed change does not seem to touch any UTF8/UTF16 shenanigans unless I am missing something.

It does. And that's a key point. We are saving some non-trivial processing.

The current code calls StringUtil::parseMessage which calls StringUtil::ParseJSON(const std::string_view) which is like so...

std::unique_ptr<Value> parseJSON(const std::string_view string){if (string.empty()) returnnullptr; size_t expected_utf16_length = simdutf::utf16_length_from_utf8(string.data(), string.length()); MaybeStackBuffer<char16_t> buffer(expected_utf16_length); // simdutf::convert_utf8_to_utf16 returns zero in case of error.size_t utf16_length = simdutf::convert_utf8_to_utf16( string.data(), string.length(), buffer.out()); // We have that utf16_length == expected_utf16_length if and only// if the input was a valid UTF-8 string.if (utf16_length == 0) returnnullptr; // We had an invalid UTF-8 input.CHECK_EQ(expected_utf16_length, utf16_length); returnparseJSONCharacters(reinterpret_cast<constuint16_t*>(buffer.out()), utf16_length)}

Observe how it converts to UTF-16.

Instead, we call directly StringUtil::parseJSON(v8_inspector::StringView string)...

std::unique_ptr<Value> parseJSON(v8_inspector::StringView string){if (string.length() == 0) returnnullptr; if (string.is8Bit()) returnparseJSONCharacters(string.characters8(), string.length()); returnparseJSONCharacters(string.characters16(), string.length())}

See how we are bypassing simdutf::convert_utf8_to_utf16?

We are possibly skipping some allocation and some transcoding...

It is not going to make a huge difference unless you are typing really fast on your keyboard, but... code simplification is always a positive, I would think.

@lemire
Copy link
MemberAuthor

Why not just delete protocol::StringUtil::parseMessage?

Yes. Let me try.

@lemire
Copy link
MemberAuthor

@eugeneo Last commit removes parseMessage entirely since it is seemingly not needed.

@anonriganonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@github-actionsgithub-actionsbot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 26, 2023
@nodejs-github-bot
Copy link
Collaborator

@lemire
Copy link
MemberAuthor

@addaleax@alexkozy Can you help?

This PR get the following issue under macOS in CI:

=== release test-inspector-wait-for-connection === Path: parallel/test-inspector-wait-for-connection Error: --- stderr --- Timed out waiting for matching notification (Console output matching "after wait for debugger") 

If it is indeed an issue introduced by the current PR, I would like to be able to reproduce it.

Running the script with Node 20 (public release), I get...

$ node test/parallel/test-inspector-wait-for-connection.js [test] Connecting to a child Node process [test] Testing /json/list [err] Debugger listening on ws://127.0.0.1:58989/8095cee9-4dfb-4cd4-a887-3171664981a3 [err] For help, see: https://nodejs.org/en/docs/inspector [err] [out] before wait for debugger [out] [err] Debugger attached. [err] [out] after wait for debugger [out] [test] Connecting to a child Node process [test] Testing /json/list [out] before second wait for debugger [out] [err] Debugger attached. [err] [out] after second wait for debugger [out] [err] Debugger ending on ws://127.0.0.1:58989/8095cee9-4dfb-4cd4-a887-3171664981a3 [err] For help, see: https://nodejs.org/en/docs/inspector [err] $ echo $? 0 

On my macBook, building the software, I get...

$ ./out/Debug/node test/parallel/test-inspector-wait-for-connection.js [test] Connecting to a child Node process [test] Testing /json/list [err] Debugger listening on ws://127.0.0.1:58995/76517d6d-976c-435c-a8fe-b912eb92f258 [err] For help, see: https://nodejs.org/en/docs/inspector [err] [out] before wait for debugger [out] [err] Debugger attached. [err] [out] after wait for debugger [out] [out] before second wait for debugger [out] [test] Connecting to a child Node process [test] Testing /json/list [err] Debugger attached. [err] [out] after second wait for debugger [out] [err] Debugger ending on ws://127.0.0.1:58995/76517d6d-976c-435c-a8fe-b912eb92f258 [err] For help, see: https://nodejs.org/en/docs/inspector [err] $ echo $? 0 

and

$ ./out/Release/node test/parallel/test-inspector-wait-for-connection.js[test] Connecting to a child Node process [test] Testing /json/list [err] Debugger listening on ws://127.0.0.1:59000/9c8cb27d-b6ec-406e-9e1e-bb1678769d48 [err] For help, see: https://nodejs.org/en/docs/inspector [err] [out] before wait for debugger [out] [err] Debugger attached. [err] [out] after wait for debugger [out] [test] Connecting to a child Node process [test] Testing /json/list [out] before second wait for debugger [out] [err] Debugger attached. [err] [out] after second wait for debugger [out] [err] Debugger ending on ws://127.0.0.1:59000/9c8cb27d-b6ec-406e-9e1e-bb1678769d48 [err] For help, see: https://nodejs.org/en/docs/inspector [err] $ echo $? 0 

So everything is fine, as far as I can see.

This code is built with the patch from the current PR, e.g., we have

voiddispatchProtocolMessage(const StringView& message){std::string raw_message = protocol::StringUtil::StringViewToUtf8(message); per_process::Debug(DebugCategory::INSPECTOR_SERVER, "[inspector received] %s\n", raw_message); std::unique_ptr<protocol::DictionaryValue> value = protocol::DictionaryValue::cast( protocol::StringUtil::parseJSON(message));

@lemire
Copy link
MemberAuthor

lemire commented Sep 27, 2023

The macOS test seems to pass now. :-/

@anonriganonrig added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Sep 27, 2023
@nodejs-github-botnodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Sep 30, 2023
@nodejs-github-botnodejs-github-bot merged commit 2990390 into nodejs:mainSep 30, 2023
@nodejs-github-bot
Copy link
Collaborator

Landed in 2990390

GeoffreyBooth pushed a commit to GeoffreyBooth/node that referenced this pull request Oct 1, 2023
alexfernandez pushed a commit to alexfernandez/node that referenced this pull request Nov 1, 2023
targos pushed a commit that referenced this pull request Nov 11, 2023
@targostargos mentioned this pull request Nov 12, 2023
debadree25 pushed a commit to debadree25/node that referenced this pull request Apr 15, 2024
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author readyPRs that have at least one approval, no pending requests for changes, and a CI started.c++Issues and PRs that require attention from people who are familiar with C++.commit-queue-squashAdd this label to instruct the Commit Queue to squash all the PR commits into the first one.inspectorIssues and PRs related to the V8 inspector protocolneeds-ciPRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

@lemire@nodejs-github-bot@anonrig@tniessen@eugeneo