@Linkgoron
Contributor

Linkgoron commented May 4, 2021

Add a non-destroying async iterator to Readable.

fixes: #38491

@nodejs/streams

A few things that I think might need attention:

  1. The API itself. Is a new method needed and is the naming OK, or should options be added to Symbol.asyncIterator?
  2. Should the new method receive different defaults than Symbol.asyncIterator?
  3. Maybe unrelated to this PR: should the createAsyncIterator method remove the listeners that it adds once the iteration ends? Currently it does not, but that was decided when ending the iteration essentially always destroyed the stream, which is no longer the case. (Removing them would be a bit problematic on error, though, as an error is emitted on the next tick if one was thrown.) I'm not sure it's that bad, as the listeners are mostly no-ops anyway.
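
For readers following the API question in point 1, here is a minimal sketch of how the new method is meant to be used. It assumes the `readable.iterator({ destroyOnReturn: false })` form available in Node.js 16.3.0 and later; the `readInTwoPhases` helper and the sample chunks are illustrative and not part of this PR.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: read one chunk, stop, then resume on the same stream.
async function readInTwoPhases() {
  const readable = Readable.from(['header', 'body-1', 'body-2']);
  const first = [];
  const rest = [];

  for await (const chunk of readable.iterator({ destroyOnReturn: false })) {
    first.push(chunk);
    break; // leaving the loop early does not destroy the stream
  }
  const destroyedAfterBreak = readable.destroyed;

  // A second iterator picks up where the first one stopped.
  for await (const chunk of readable.iterator({ destroyOnReturn: false })) {
    rest.push(chunk);
  }

  return { first, rest, destroyedAfterBreak };
}
```

The second loop only sees the chunks the first loop did not consume, because the stream itself stays alive between the two iterations.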

add a non-destroying iterator to Readable fixes: nodejs#38491
@github-actions bot added the needs-ci label (PRs that need a full CI run) on May 4, 2021
@Linkgoron added the stream label (Issues and PRs related to the stream subsystem) on May 4, 2021
@mcollina
Member

mcollina left a comment

Good work! I've left a few notes.

has less than 64KB of data because no `highWaterMark` option is provided to
[`fs.createReadStream()`][].

##### `readable.iterator([options])`

Contributor

Just a thought: does this have to be in a new method rather than adding a parameter to the existing Symbol.asyncIterator one?

Member

I'm -0 on naming. I prefer a new method as the Symbol.asyncIterator one has a predefined signature by the standard.

Member

It doesn't feel right to me if the user has to call [Symbol.asyncIterator]() themselves.

Contributor

> the Symbol.asyncIterator one has a predefined signature by the standard.

Not really, all the standard says is that this method is called with no arguments (https://tc39.es/ecma262/#sec-getiterator). The standard gives a clear rule for the returned object (https://tc39.es/ecma262/#sec-asynciterable-interface), but not for the function signature. I personally don't feel strongly either way.
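
A small sketch of that point: `for await...of` invokes `[Symbol.asyncIterator]` without arguments, so any parameter added to it would only be reachable through an explicit call. The `iterable` object and `seenArgCounts` array below are made up for illustration.

```javascript
// Records how many arguments the language passes to [Symbol.asyncIterator].
const seenArgCounts = [];

const iterable = {
  [Symbol.asyncIterator](...args) {
    seenArgCounts.push(args.length);
    let i = 0;
    return {
      // Yields 0 and 1, then finishes.
      next: async () =>
        (i < 2 ? { value: i++, done: false } : { value: undefined, done: true }),
    };
  },
};

async function run() {
  const values = [];
  for await (const value of iterable) {
    values.push(value);
  }
  return { values, seenArgCounts };
}
```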

Member

I'd prefer the separate method. The key issue with extending standard-defined APIs is that it makes reasoning about the portability of code far more complicated. A separate method makes it clear. That said, the behavior of the two can be identical such that [Symbol.asyncIterator]() could just defer to readable.iterator() with default arguments.

Contributor, Author

> That said, the behavior of the two can be identical such that [Symbol.asyncIterator]() could just defer to readable.iterator() with default arguments.

Yeah, the reason that one doesn't call the other is because of legacy streams and `this`. I'd need to use ReflectApply to bind `this`, and I preferred to have a regular function that receives `this` as its first parameter instead of going through primordials.
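
The delegation idea can be sketched in userland as follows. This is not how the PR implements it internally; `DelegatingReadable` and `collect` are hypothetical names, and the sketch assumes a Node.js version that ships `readable.iterator()` (16.3.0 or later).

```javascript
const { Readable } = require('node:stream');

// Hypothetical subclass whose [Symbol.asyncIterator] defers to iterator().
class DelegatingReadable extends Readable {
  constructor(items) {
    super({ objectMode: true });
    this._items = [...items];
  }

  _read() {
    // Push the next item, or null to signal the end of the stream.
    this.push(this._items.length > 0 ? this._items.shift() : null);
  }

  [Symbol.asyncIterator]() {
    return this.iterator(); // default options, same behavior as before
  }
}

// Drains a stream with for await...of and returns the chunks.
async function collect(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks;
}
```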

@mcollina
Member

mcollina left a comment

lgtm

@benjamingr
Member

I'm a bit torn about this because I think there is a good chance we picked the wrong default for `return` on streams. I think people will pretty often need non-destructive iterators, and for those cases having to write `stream.iterator({ destroyOnReturn: false, destroyOnError: false })` isn't super ergonomic.

@mcollina
Member

I think the defaults are currently sound as the developer would need to do something to destroy the stream manually. The current defaults are safe.
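
To illustrate what "safe" means here, a minimal sketch of the default behavior: breaking out of a plain `for await...of` loop destroys the stream, so nothing is left open by accident. The `breakWithDefaults` helper is illustrative only.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: leave a default for await...of loop early and
// report whether the stream was destroyed as a result.
async function breakWithDefaults() {
  const readable = Readable.from(['a', 'b', 'c']);
  for await (const chunk of readable) {
    break; // default iterator: exiting early destroys the stream
  }
  return readable.destroyed;
}
```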

@benjamingr
Member

So here are the two use cases I have:

    for await (const chunk of fs.createReadStream('./foo')) {
      // this should not leak
    }
    for await (const chunk of someMethodReturningAStream()) {
      // this should not leak
    }

On the other hand, I want the following to also work:

    const stream = fs.createReadStream('./someFile');
    for await (const chunk of stream) {
      if (isSpecial(chunk)) break;
      processFirst(chunk); // e.g. read http headers
    }
    for await (const chunk of stream) {
      // continues from where the last for await ended
      if (isOtherSpecial(chunk)) break;
      processSecond(chunk); // e.g. read http body
    }

@Linkgoron
Contributor, Author

Linkgoron commented May 6, 2021

> So here are the two use cases I have:
> On the other hand, I want the following to also work:
>
>     const stream = fs.createReadStream('./someFile');
>     for await (const chunk of stream) {
>       if (isSpecial(chunk)) break;
>       processFirst(chunk); // e.g. read http headers
>     }
>     for await (const chunk of stream) {
>       // continues from where the last for await ended
>       if (isOtherSpecial(chunk)) break;
>       processSecond(chunk); // e.g. read http body
>     }

One option would be to add a setIterationMode on Readable, where you could set the default type of iterator that Symbol.asyncIterator would return. Other options include stuff like having another alias for nonDestructive iterators (not a fan of having tons of aliases for the same method) or having iterator return with different defaults (which I don't think has much support here).

setIterationMode would look something like this:

    const stream = fs.createReadStream('./someFile');
    stream.setIterationMode({ destroyOnReturn: false, destroyOnError: false });
    for await (const chunk of stream) {
      if (isSpecial(chunk)) break;
      processFirst(chunk); // e.g. read http headers
    }
    for await (const chunk of stream) {
      // continues from where the last for await ended
      if (isOtherSpecial(chunk)) break;
      processSecond(chunk); // e.g. read http body
    }

@mcollina
Member

@benjamingr the requirements in #38526 (comment) are mutually exclusive.

@benjamingr
Member

> @benjamingr the requirements in #38526 (comment) are mutually exclusive.

That's sort of my point - it's the most concise way I could describe the problem. I'm not sure .iterator is the best way to deal with it. I'm wondering if this should be on the stream rather than the iterator.

It's possible that this is the best we can do - I just think it's a difficult problem and I want to make sure we're not exploring too few options here.

@mcollina
Member

How could it be on the stream?

@Linkgoron
Contributor, Author

@benjamingr Do you have any outstanding objections to this getting merged?

@benjamingr
Member

Nope, just uncertainty 😅

@benjamingr
Member

I'd be more comfortable if this was experimental but I'm fine with this landing as stable.

@mcollina
Member

I'd be happy to land it doc-experimental if you prefer @benjamingr? No warnings and we do not backport.

@mcollina added the baking-for-lts label (PRs that need to wait before landing in a LTS release) on May 19, 2021

@benjamingr
Member

> I'd be happy to land it doc-experimental if you prefer @benjamingr?

doc-experimental is good to me. If others feel strongly that this is the right API I'd also happily concede.

@mcollina
Member

@Linkgoron can you add the experimental badge in there?

@Linkgoron
Contributor, Author

Added the experimental tag in the method docs

@nodejs-github-bot
Collaborator

nodejs-github-bot commented May 24, 2021

CI: https://ci.nodejs.org/job/node-test-pull-request/38316/ 💚

@Linkgoron added the author ready label (PRs that have at least one approval, no pending requests for changes, and a CI started) on May 25, 2021

@jasnell
Member

Landed in df85d37

@jasnell closed this May 25, 2021
jasnell pushed a commit that referenced this pull request May 25, 2021

    add a non-destroying iterator to Readable

    fixes: #38491

    PR-URL: #38526
    Fixes: #38491
    Reviewed-By: James M Snell
    Reviewed-By: Matteo Collina
    Reviewed-By: Robert Nagy
    Reviewed-By: Benjamin Gruenbaum

danielleadams pushed a commit with the same message that referenced this pull request May 31, 2021
@danielleadams mentioned this pull request May 31, 2021
@targos removed the baking-for-lts label (PRs that need to wait before landing in a LTS release) on Sep 16, 2022

Labels: author ready, needs-ci, stream

Successfully merging this pull request may close these issues:

Add non-destroying AsyncIterator to Readable streams

9 participants: @Linkgoron, @nodejs-github-bot, @benjamingr, @mcollina, @jasnell, @Trott, @targos, @ronag, @aduh95