Conversation

@PopFlamingo
Contributor

@PopFlamingo commented Sep 13, 2019

This commit adds a basic connection pool implementation. It is still very much a work in progress, but it includes breaking changes that probably need to be taken care of before the 1.0.0 tag (see the comment below).

I'll add as many comments as possible tonight to explain the parts that need it.

Let me know what you think! 🙂

There are many things to fix before the PR is truly ready for review/merge:

Making HTTPBin ready for pooling tests:

  • Add delayed responses (enables testing that connection providers create new connections when existing ones are in use)
  • Add connections that close themselves after they have served a request
  • Add channels that close themselves after a certain delay (to test how the pool reacts when a stored channel closes)
  • Simulate connection delays (useful for performance testing of pooled vs. non-pooled)

Misc issues/improvements

  • Fix an issue where an existing ConnectionProvider does not add the proper SSL handler for HTTPS connections
  • Avoid holding locks for as long as is currently the case

Testing

  • Stress tests for the pool
  • Check that the maximum number of concurrent connections is respected for HTTP/1.1

Is this needed?

  • Build a performance comparison between the pooled and non-pooled client

Not in this commit:
HTTP/2:

  • Add HTTP/2 support to HTTPBin
  • Finish and test the implementation of the HTTP/2 connection provider
  • Correctly detect HTTP/2 when it is available

Collaborator

@Lukasa left a comment


The general shape of this seems good, but I think we need a lot more comments explaining what the strategy is here. There's a decent amount of code duplication that could be cleaned up, too. However, it's definitely a great start, well done!

@PopFlamingo
Contributor (Author)

@Lukasa

The general shape of this seems good, but I think we need a lot more comments explaining what the strategy is here. There's a decent amount of code duplication that could be cleaned up, too. However, it's definitely a great start, well done!

Thank you! I'll add comments to explain everything.

@Lukasa @t089 I will fix the parts where you noticed issues, thank you for your feedback! There is also quite a lot of code that needs to be cleaned up.

The main goal of opening this draft PR right now is to ensure that breaking changes can be added before the 1.0.0 tag, even if the actual pooling were postponed until after the release. From what I remember of previous discussions, being able to add pooling entirely transparently in a SemVer minor version would be nice to have; on that subject, I think the breaking changes might need to be merged before the release tag.

I think #102 already addresses what's needed for thread-safety issues and for the pool to have enough information to make better decisions for performance, but there is another part to consider: retried requests.
Since a connection pool increases the chance of failure, because connections are kept alive for longer, an automatic retry strategy is needed. Note that connection "failure" for non-leased connections already has minimal support in the current implementation: if a connection closes, it is removed from the pool immediately.
Once the implementation matures, the TaskHandler will probably want to detect when an in-flight task fails and retry it automatically if possible, as discussed with @Lukasa and @weissi. AFAICT this mechanism might need breaking changes to HTTPClientResponseDelegate: when a request is retried, some delegates will need to reset their internal state, so a delegate method such as willRetryRequest(...) -> EventLoopFuture<Void> might be needed.
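
To make this concrete, here is a minimal sketch of what a retry-aware delegate could look like. Note that `willRetryRequest(task:)` is not part of the existing `HTTPClientResponseDelegate` API; its name and signature are just the hypothetical hook mentioned above.

```swift
import AsyncHTTPClient
import NIO

// Illustrative sketch only: `willRetryRequest(task:)` is a hypothetical hook,
// not existing API. The rest follows the current HTTPClientResponseDelegate shape.
final class ByteCountingDelegate: HTTPClientResponseDelegate {
    typealias Response = Int

    private var receivedBytes = 0

    func didReceiveBodyPart(task: HTTPClient.Task<Int>, _ buffer: ByteBuffer) -> EventLoopFuture<Void> {
        // Accumulate state that would become stale if the request were retried.
        receivedBytes += buffer.readableBytes
        return task.eventLoop.makeSucceededFuture(())
    }

    // Hypothetical retry hook: reset accumulated state before the request is
    // executed again, possibly on a different connection.
    func willRetryRequest(task: HTTPClient.Task<Int>) -> EventLoopFuture<Void> {
        receivedBytes = 0
        return task.eventLoop.makeSucceededFuture(())
    }

    func didFinishRequest(task: HTTPClient.Task<Int>) throws -> Int {
        return receivedBytes
    }
}
```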

About this I have two ideas:

  • Make a breaking change before the 1.0.0 tag by requiring all delegate implementations to explicitly handle retries; this way, a connection pool with retries can be added entirely transparently in the future.
  • Make no change to the current delegate requirements and add an opt-in change in the future (by making a new type of delegate for retries, or by adding a default implementation): in this case a connection pool could still be added, but no SemVer minor change could automatically enable the retry mechanism for everyone unless the library user explicitly starts handling retries.

I'm not really sure what the best thing to do would be, so I would appreciate your opinions on this; if needed, I can also open it as a separate issue.

@Lukasa
Collaborator

Make a breaking change before the 1.0.0 tag by requiring all delegate implementations to explicitly handle retries; this way, a connection pool with retries can be added entirely transparently in the future.

I'm quite tempted by this approach, as it allows us to potentially shim a solution in by having an automatic-retry delegate that can be composed with a user-provided delegate in some way. In general, trying to do too much is likely to go badly for us. I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so), as it allows us to punt on making the decision any further.

However, @weissi may be opposed to adding more API for this.

@PopFlamingo
Contributor (Author)

PopFlamingo commented Sep 16, 2019

@Lukasa

I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so)

Just to be sure about what exactly you mean by "handled by delegates": do you mean something similar to willRetryRequest(...)? Or do you mean even more "manual" handling?

@PopFlamingo
Contributor (Author)

PopFlamingo commented Sep 18, 2019

Commit 0d33412 adds new options, passed as Connection: close and X-internal-delay: [delayInMilliseconds] HTTP headers, to control how requests are served (respectively: closing the connection after the request is served, and adding a delay to the HTTP response). There is also a new HTTPBin init option to control the initial connection delay of a client to the server, and a new option to make connections (i.e. Channels opened by the clients on the server) close after a certain delay.
Some new tests already take advantage of this, for instance testStressGetHttps and testResponseConnectionCloseGet.
The commit also fixes miscellaneous code and documentation issues.
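
For reference, here is a rough sketch of how a test could exercise these headers. The setup is an assumption (a placeholder port stands in for the HTTPBin test server); it is not verbatim test code.

```swift
import AsyncHTTPClient

// Assumption: the HTTPBin test server from the test suite is listening on this port.
let httpBinPort = 8080

let client = HTTPClient(eventLoopGroupProvider: .createNew)
defer { try? client.syncShutdown() }

var request = try HTTPClient.Request(url: "http://localhost:\(httpBinPort)/get")
request.headers.add(name: "X-internal-delay", value: "100") // respond after a 100 ms delay
request.headers.add(name: "Connection", value: "close")     // close the connection once the request is served
let response = try client.execute(request: request).wait()
print(response.status)
```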

@weissi
Contributor

I'm quite tempted by this approach, as it allows us to potentially shim a solution in by having an automatic-retry delegate that can be composed with a user-provided delegate in some way. In general, trying to do too much is likely to go badly for us. I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so), as it allows us to punt on making the decision any further.

However, @weissi may be opposed to adding more API for this.

I think I'm down with that. I think we should make automatic retrying very easy for somebody to implement in their own delegate. That could work either with composition as @Lukasa proposed or possibly with a default implementation?

@kylebrowning

kylebrowning commented Nov 15, 2019

I'm trying to bring this back to light because it speeds up AsyncHTTPClient considerably.

Without connection pooling:

SwiftNIO run time: 0.3538249731063843
SwiftNIO run time: 0.3536069393157959
SwiftNIO run time: 0.33445894718170166
SwiftNIO run time: 0.30274498462677
SwiftNIO run time: 0.3761049509048462
SwiftNIO run time: 0.5195209980010986
SwiftNIO run time: 0.36058497428894043
SwiftNIO run time: 0.29895198345184326
SwiftNIO run time: 0.3660769462585449
SwiftNIO run time: 0.29245102405548096
URLSession run time: 0.35767805576324463
URLSession run time: 0.3387099504470825
URLSession run time: 0.20786607265472412
URLSession run time: 0.33779799938201904
URLSession run time: 0.21909499168395996
URLSession run time: 0.210394024848938
URLSession run time: 0.2118690013885498
URLSession run time: 0.20524907112121582
URLSession run time: 0.20901107788085938
URLSession run time: 0.21499598026275635

With connection pooling:

SwiftNIO run time: 0.33787500858306885
SwiftNIO run time: 0.1476989984512329
SwiftNIO run time: 0.20796406269073486
SwiftNIO run time: 0.2085040807723999
SwiftNIO run time: 0.20637094974517822
SwiftNIO run time: 0.15462100505828857
SwiftNIO run time: 0.13620400428771973
SwiftNIO run time: 0.14024698734283447
SwiftNIO run time: 0.13931310176849365
SwiftNIO run time: 0.2430809736251831
URLSession run time: 0.35739099979400635
URLSession run time: 0.26072800159454346
URLSession run time: 0.26291704177856445
URLSession run time: 0.26088404655456543
URLSession run time: 0.20262694358825684
URLSession run time: 0.19789695739746094
URLSession run time: 0.33044493198394775
URLSession run time: 0.2090669870376587
URLSession run time: 0.20859694480895996
URLSession run time: 0.21186602115631104

I also ran it on an iOS device with 10 requests; this is the total combined time in release mode.

---no-pooling---
SwiftNIO run time: 1.8052639961242676
URLSession run time: 1.2042559385299683
---pooling---
SwiftNIO run time: 1.191763997077942
URLSession run time: 1.1281760931015015

I worked on the PR a bit, but I can't seem to get the decompression test to pass.

@PopFlamingo marked this pull request as ready for review on November 16, 2019 00:39
@PopFlamingo changed the title from "WIP: Prototype connection pool implementation" to "Add a connection pool" on Nov 16, 2019
@PopFlamingo
Contributor (Author)

PopFlamingo commented Nov 16, 2019

The PR is now ready for review! 🙂

cc @weissi @Lukasa @t089 @tomerd @artemredkin

Thanks to the invaluable help of @kylebrowning, this PR now includes the latest changes from master and all tests are passing. The benchmarks are really insightful too, and they let us confirm that these changes are beneficial to performance.

What do you think about the connection pool implementation? What can be improved / made clearer?

Thank you!

A few things to check in all cases before merging:

  • Check there is no problematic reference cycle between connections and the pool
  • Remove HTTP/2-related pooling; this PR only focuses on HTTP/1.1

/// throw the appropriate error if needed. For instance, if its internal connection pool has any non-released connections,
/// this indicates shutdown was called too early, before tasks were completed or explicitly canceled.
/// In general, setting this parameter to `true` should make it easier and faster to catch related programming errors.
public func syncShutdown(requiresCleanClose: Bool) throws {
Contributor


Do we need this as public API? I think this can remain internal, no?

Contributor (Author)


@weissi I think it might need to be public because syncShutdown() calls syncShutdown(requiresCleanClose: false) for backwards compatibility, but since users should preferably do a clean close, they need to be able to call syncShutdown(requiresCleanClose: true). WDYT?
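
A toy sketch of the forwarding being described (the type and method bodies are assumptions, not the PR's actual implementation):

```swift
// Toy illustration of the relationship between the two shutdown entry points.
final class PooledClient {
    struct UncleanShutdownError: Error {}

    private var leasedConnectionCount = 0

    /// Backwards-compatible entry point: behaves like shutdown did before pooling existed.
    func syncShutdown() throws {
        try self.syncShutdown(requiresCleanClose: false)
    }

    /// Opt-in strict variant: surfaces shutdowns that happen while tasks still hold connections.
    func syncShutdown(requiresCleanClose: Bool) throws {
        if requiresCleanClose, leasedConnectionCount > 0 {
            throw UncleanShutdownError()
        }
        // ... close idle pooled connections and shut down the event loop group ...
    }
}
```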

Contributor


But if you were a user of this library, wouldn't you be super confused about which shutdown to use and what parameter to pass? I'd just not expose syncShutdown(requiresCleanClose:) (it can be used in tests, of course). If we feel it's super important, we can always make it public later.

Contributor (Author)


I was worried about users shutting down the client too early and having difficulty understanding why all their tasks are getting cancelled, but I recognise requiresCleanClose is a bit confusing and the docs I added are not as clear as they could be. I'll remove it for now then; it's never too late to add it later.

Collaborator


Were you planning to remove this?

Contributor (Author)

@PopFlamingo left a comment


Comments from looking at the test coverage

// if the `throw` wasn't delayed until everything
// is executed
if closeError == nil {
    closeError = error
Contributor (Author)


Is there any possible error other than ChannelError.alreadyClosed on close? If there is a reproducible one, it might be good to add a test that makes this function throw.

Contributor


To be precise, it depends on the channel handlers in the pipeline. I think ignoring alreadyClosed and surfacing everything else makes sense.
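
A minimal sketch of that behaviour (not the PR's exact code): swallow ChannelError.alreadyClosed and surface every other close error.

```swift
import NIO

// Sketch: ignore ChannelError.alreadyClosed when closing, surface everything else.
func closeIgnoringAlreadyClosed(_ channel: Channel) -> EventLoopFuture<Void> {
    return channel.close().flatMapError { error in
        if let channelError = error as? ChannelError, case .alreadyClosed = channelError {
            // The channel is already gone; treat this as a successful close.
            return channel.eventLoop.makeSucceededFuture(())
        }
        // Anything else (for example an error thrown by a handler in the
        // pipeline) is unexpected and should reach the caller.
        return channel.eventLoop.makeFailedFuture(error)
    }
}
```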

Contributor

@weissi left a comment


This is looking really good! I think there's a race left, though (and a few nits).

@PopFlamingo
Contributor (Author)

Thanks for the new test in 92aedbd, @weissi!

Collaborator

@artemredkin left a comment


Looks very good! I have one question to discuss: right now we have a fixed maximum number of concurrent connections; should we make it a configuration option, or skip that for now? What are your thoughts?
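
Purely as an illustration of the question, such an option could look roughly like the sketch below. The names are assumptions; the PR as written keeps the limit fixed (up to 8 parallel connections per (scheme, host, port)).

```swift
// Hypothetical sketch of a pool configuration knob; not part of the PR.
struct ConnectionPoolConfiguration {
    /// Maximum number of parallel connections per (scheme, host, port) key.
    /// The PR currently fixes this at 8.
    var maximumConcurrentConnectionsPerHost: Int = 8
}

// Example: a caller opting into a larger per-host limit.
let poolConfig = ConnectionPoolConfiguration(maximumConcurrentConnectionsPerHost: 16)
```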

@weissi
Contributor

@artemredkin why not start with fixed and add config later in a separate PR?

@PopFlamingo
Contributor (Author)

@artemredkin Thanks!

The latest commit fixes the bugs found by @weissi.

@weissi modified the milestones: 1.1.0, 1.2.0 on Feb 12, 2020
@weissi requested a review from Lukasa on February 12, 2020 18:08
}

/// Get a new connection from this provider
func getConnection(preference: HTTPClient.EventLoopPreference) -> EventLoopFuture<Connection> {
Collaborator


In some sense it doesn't matter which event loop this future returns on, so long as we are always consistent about that choice; otherwise it becomes impossible to write thread-safe programs. Given that we allow users to express an event loop preference here, it seems sensible to follow the idea that the future should be returned on the event loop they prefer.

There are two good choices for which event loop the future callbacks should run on:

  1. The event loop of the returned connection.
  2. The event loop of the caller.

Both of these are defensible, depending on exactly what the user is trying to do. It doesn't matter too much which of those we choose, but we should be attempting to guarantee that this is the case. Here we lose track of what the event loop of the future we're returning is, so it would be good to keep track of it.
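
One way to make the guarantee explicit, sketched under the assumption that we settle on the caller's preferred loop (the helper name is made up, not the PR's code):

```swift
import NIO

// However the connection future was produced, hop it onto the preferred event
// loop so the loop its callbacks run on is always predictable for the caller.
func pinned<Value>(_ future: EventLoopFuture<Value>, to preferredLoop: EventLoop) -> EventLoopFuture<Value> {
    // `hop(to:)` returns an equivalent future whose callbacks run on `preferredLoop`.
    return future.hop(to: preferredLoop)
}
```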

/// throw the appropriate error if needed. For instance, if its internal connection pool has any non-released connections,
/// this indicates shutdown was called too early, before tasks were completed or explicitly canceled.
/// In general, setting this parameter to `true` should make it easier and faster to catch related programming errors.
public func syncShutdown(requiresCleanClose: Bool) throws {
Collaborator


Were you planning to remove this?

@PopFlamingo
Contributor (Author)

@Lukasa Thanks for the nits, I'll check and address your other review comments soon.

@Lukasa
Collaborator

Great, please @ me when you're ready for a re-review.

@PopFlamingo
Contributor (Author)

This PR is ready for a new round of reviews! 🙂
All FIXMEs and change requests have now been addressed, and the latest commits contain fixes for all the unhandled errors and synchronisation issues I knew of.

If you see no additional issues, I think the PR is ready to be merged. I'll squash the commits and write a proper commit message once the PR is approved.

Thanks

Contributor

@weissi left a comment


Thank you so much! I am happy to merge this as is. This is a very major milestone for AsyncHTTPClient; thanks to everybody involved, most importantly @adtrevor.

Collaborator

@artemredkin left a comment


Same here, thank you @adtrevor !

motivation: Better performance thanks to connection reuse

changes:
  • Added a connection pool for HTTP/1.1
  • All requests automatically use the connection pool
  • Up to 8 parallel connections per (scheme, host, port)
  • Multiple additional unit tests
Collaborator

@Lukasa left a comment


Let's do it. Thanks so much for your work @adtrevor!

@Lukasa merged commit 19e2ea7 into swift-server:master on Feb 25, 2020
@kylebrowning

YAY!

@PopFlamingo
Contributor (Author)

Thank you all for your reviews and advice, I learned a lot doing this! 🙂

Special thanks to you too @kylebrowning for bringing this back up a few months ago!


Labels

🆕 semver/minor: Adds new public API.


8 participants

@PopFlamingo, @Lukasa, @weissi, @kylebrowning, @adam-fowler, @tomerd, @artemredkin, @t089