How the HTTP/1.1 protocol is implemented in the Golang net/http package: part one - request workflow
Background
In this article, I'll write about one topic: how the HTTP protocol is implemented. I have been planning to write about this for a long time. I have already written several articles about the HTTP protocol, and I recommend reading them before this one.
As you know, the HTTP protocol sits in the application layer, the layer of the protocol stack closest to the end user.
So, relatively speaking, HTTP is not as mysterious as the protocols in the lower layers of the stack. Software engineers use HTTP every day and take it for granted. Have you ever thought about how to implement a fully functional HTTP library?
It turns out to be a large and complex piece of software engineering. Frankly, I couldn't build one by myself in a short period. So in this article, we'll try to understand how it is done by studying Golang's net/http package as an example. We'll read a lot of source code and use diagrams to help make sense of it.
Note: the HTTP protocol itself has evolved a lot from HTTP/1.1 to HTTP/2 and HTTP/3, not to mention HTTPS. In this article, we'll focus on the mechanism of HTTP/1.1, but what you learn here can help you understand the newer versions of the protocol.
Note: HTTP is based on the client-server model. This article focuses on the client side. I'll cover the HTTP server side in a separate article.
Main workflow of http.Client
An HTTP client request starts with the application's call to the Get function of the net/http package and ends with the HTTP message being written to the TCP socket. The whole workflow can be simplified to the following diagram:
First, the public Get function calls the Get method of DefaultClient, which is a global variable of type Client. That method builds the Request object through NewRequest, which in turn calls NewRequestWithContext.
I won't show the whole body of NewRequestWithContext, since it's very long. Here is only the block of code that actually builds the Request object:
    req := &Request{
        // omit some code
        Proto:      "HTTP/1.1", // the default HTTP protocol version is set to 1.1
        ProtoMajor: 1,
        ProtoMinor: 1,
        // omit some code
    }
Note that the Request is created with the HTTP protocol version set to 1.1. Sending HTTP/2 requests goes through a different path, which I'll write about in other articles.
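To see these defaults for yourself, here is a minimal sketch that builds a request without sending it. The helper name buildGetRequest and the example.com URL are my own placeholders, not part of net/http.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// buildGetRequest builds a GET request without sending it.
// The name and the URL used in main are placeholders chosen for illustration.
func buildGetRequest(url string) (*http.Request, error) {
	return http.NewRequestWithContext(context.Background(), http.MethodGet, url, nil)
}

func main() {
	req, err := buildGetRequest("http://example.com/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the protocol fields that NewRequestWithContext filled in.
	fmt.Println(req.Proto, req.ProtoMajor, req.ProtoMinor) // prints "HTTP/1.1 1 1"
}
```

Nothing touches the network here, so this is a cheap way to inspect what a freshly built Request looks like.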
Next, the Do method is called, which delegates the work to the private do method.
The do method handles HTTP redirects, which is very interesting. But since the code is too long, I won't show its body here. You can read it in the source code of client.go.
Next, the send method of Client is called. It hands the request over to the client's Transport.
Transport is extremely important for the HTTP client workflow. Let's examine how it works bit by bit. First of all, its type is the RoundTripper interface:
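The redirect handling is easy to observe from the outside. The following sketch, which assumes a local test server with made-up /old and /new paths, compares the default behavior with a client whose CheckRedirect hook returns http.ErrUseLastResponse to stop at the first response. The helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newRedirectServer starts a local test server where /old redirects to /new.
// Both paths are invented for this example.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/old", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/new", http.StatusFound)
	})
	mux.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "arrived")
	})
	return httptest.NewServer(mux)
}

// fetchStatus requests /old and returns the status code plus the path of the
// request that produced the response. When follow is false, CheckRedirect
// tells the client to return the redirect response itself.
func fetchStatus(baseURL string, follow bool) (int, string, error) {
	client := &http.Client{}
	if !follow {
		client.CheckRedirect = func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		}
	}
	resp, err := client.Get(baseURL + "/old")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Request.URL.Path, nil
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()
	for _, follow := range []bool{true, false} {
		code, path, err := fetchStatus(srv.URL, follow)
		fmt.Println(follow, code, path, err)
	}
}
```

With redirects followed, the response comes from /new with status 200; with the hook in place, the client returns the 302 from /old.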
    // this interface is defined inside client.go file
    type RoundTripper interface {
        // RoundTrip executes a single HTTP transaction, returning
        // a Response for the provided Request.
        RoundTrip(*Request) (*Response, error)
    }
The RoundTripper interface defines only one method, RoundTrip.
If you don't configure anything special, DefaultTransport is used in place of c.Transport.
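Because RoundTripper is a one-method interface, it is easy to plug in your own implementation. As a rough sketch, the hypothetical countingTransport below wraps http.DefaultTransport and counts how many transactions pass through it. The type, the helper functions, and the local test server are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// countingTransport is a hypothetical RoundTripper that counts transactions
// and delegates the real work to another RoundTripper.
// It is not safe for concurrent use; this sketch only sends requests sequentially.
type countingTransport struct {
	next  http.RoundTripper
	count int
}

// RoundTrip satisfies the http.RoundTripper interface.
func (c *countingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	c.count++
	return c.next.RoundTrip(req)
}

// newOKServer starts a local test server that answers every request with "ok".
func newOKServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
}

// countRoundTrips sends n GET requests through a client that uses
// countingTransport and returns how many times RoundTrip was invoked.
func countRoundTrips(url string, n int) (int, error) {
	ct := &countingTransport{next: http.DefaultTransport}
	client := &http.Client{Transport: ct}
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			return ct.count, err
		}
		resp.Body.Close()
	}
	return ct.count, nil
}

func main() {
	srv := newOKServer()
	defer srv.Close()
	n, err := countRoundTrips(srv.URL, 3)
	fmt.Println(n, err) // prints "3 <nil>"
}
```

The same wrapping pattern is commonly used for logging, metrics, or injecting headers, since the Client only ever sees the interface.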
    // send method in client.go
    func send(ireq *Request, rt RoundTripper, deadline time.Time) (resp *Response, didTimeout func() bool, err error) {
        req := ireq // req is either the original request, or a modified fork

        if rt == nil {
            req.closeBody()
            return nil, alwaysFalse, errors.New("http: no Client.Transport or DefaultTransport")
        }

        if req.RequestURI != "" {
            req.closeBody()
            return nil, alwaysFalse, errors.New("http: Request.RequestURI can't be set in client requests")
        }

        // forkReq forks req into a shallow clone of ireq the first
        // time it's called.
        forkReq := func() {
            if ireq == req {
                req = new(Request)
                *req = *ireq // shallow clone
            }
        }

        // Most the callers of send (Get, Post, et al) don't need
        // Headers, leaving it uninitialized. We guarantee to the
        // Transport that this has been initialized, though.
        if req.Header == nil {
            forkReq()
            req.Header = make(Header)
        }

        if u := req.URL.User; u != nil && req.Header.Get("Authorization") == "" {
            username := u.Username()
            password, _ := u.Password()
            forkReq()
            req.Header = cloneOrMakeHeader(ireq.Header)
            req.Header.Set("Authorization", "Basic "+basicAuth(username, password))
        }

        resp, err = rt.RoundTrip(req)
        if err != nil {
            stopTimer()
            if resp != nil {
                log.Printf("RoundTripper returned a response & error; ignoring response")
            }
            if tlsErr, ok := err.(tls.RecordHeaderError); ok {
                // If we get a bad TLS record header, check to see if the
                // response looks like HTTP and give a more helpful error.
                // See golang.org/issue/11111.
                if string(tlsErr.RecordHeader[:]) == "HTTP/" {
                    err = errors.New("http: server gave HTTP response to HTTPS client")
                }
            }
            return nil, didTimeout, err
        }
        if resp == nil {
            return nil, didTimeout, fmt.Errorf("http: RoundTripper implementation (%T) returned a nil *Response with a nil error", rt)
        }
        if resp.Body == nil {
            // The documentation on the Body field says “The http Client and Transport
            // guarantee that Body is always non-nil, even on responses without a body
            // or responses with a zero-length body.” Unfortunately, we didn't document
            // that same constraint for arbitrary RoundTripper implementations, and
            // RoundTripper implementations in the wild (mostly in tests) assume that
            // they can use a nil Body to mean an empty one (similar to Request.Body).
            // (See https://golang.org/issue/38095.)
            //
            // If the ContentLength allows the Body to be empty, fill in an empty one
            // here to ensure that it is non-nil.
            if resp.ContentLength > 0 && req.Method != "HEAD" {
                return nil, didTimeout, fmt.Errorf("http: RoundTripper implementation (%T) returned a *Response with content length %d but a nil Body", rt, resp.ContentLength)
            }
            resp.Body = ioutil.NopCloser(strings.NewReader(""))
        }
        if !deadline.IsZero() {
            resp.Body = &cancelTimerBody{
                stop:          stopTimer,
                rc:            resp.Body,
                reqDidTimeout: didTimeout,
            }
        }
        return resp, nil, nil
    }
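One detail of send that is easy to miss is the Authorization header derived from the user information in the URL. The sketch below tries to make it visible with a hypothetical stub RoundTripper (captureTransport) that records the request it receives and returns a canned response, so no network is involved. The credentials and host are invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// captureTransport is a hypothetical stub RoundTripper: it records the
// request handed to it and returns a canned 200 response without any I/O.
type captureTransport struct {
	got *http.Request
}

// RoundTrip satisfies the http.RoundTripper interface.
func (c *captureTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	c.got = req
	return &http.Response{
		Status:     "200 OK",
		StatusCode: http.StatusOK,
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("")),
		Request:    req,
	}, nil
}

// basicAuthSeen issues a GET for rawURL and reports the Basic credentials
// found on the request that reached the transport, if any.
func basicAuthSeen(rawURL string) (user, pass string, ok bool, err error) {
	ct := &captureTransport{}
	client := &http.Client{Transport: ct}
	resp, err := client.Get(rawURL)
	if err != nil {
		return "", "", false, err
	}
	resp.Body.Close()
	user, pass, ok = ct.got.BasicAuth()
	return user, pass, ok, nil
}

func main() {
	// The credentials and host are made up for the example.
	fmt.Println(basicAuthSeen("http://alice:secret@example.com/"))
	fmt.Println(basicAuthSeen("http://example.com/"))
}
```

The application code never sets a header, yet the transport sees the credentials when they are present in the URL, which matches the req.URL.User branch in send.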
The most important line of the send function shown earlier is:
    resp, err = rt.RoundTrip(req)
The RoundTrip method is called to send the request. Based on the comments in the source code, you can understand it in the following way:
RoundTripper is an interface representing the ability to execute a single HTTP transaction, obtaining the Response for a given Request.
    // roundTrip method of Transport in transport.go
    func (t *Transport) roundTrip(req *Request) (*Response, error) {
        // omit some code
        if altRT := t.alternateRoundTripper(req); altRT != nil {
            if resp, err := altRT.RoundTrip(req); err != ErrSkipAltProtocol {
                return resp, err
            }
            var err error
            req, err = rewindBody(req)
            if err != nil {
                return nil, err
            }
        }
        if !isHTTP {
            req.closeBody()
            return nil, badStringError("unsupported protocol scheme", scheme)
        }
        if req.Method != "" && !validMethod(req.Method) {
            req.closeBody()
            return nil, fmt.Errorf("net/http: invalid method %q", req.Method)
        }
        if req.URL.Host == "" {
            req.closeBody()
            return nil, errors.New("http: no Host in request URL")
        }

        for {
            select {
            case <-ctx.Done():
                req.closeBody()
                return nil, ctx.Err()
            default:
            }

            // treq gets modified by roundTrip, so we need to recreate for each retry.
            treq := &transportRequest{Request: req, trace: trace, cancelKey: cancelKey}
            cm, err := t.connectMethodForRequest(treq)
            if err != nil {
                req.closeBody()
                return nil, err
            }

            // Get the cached or newly-created connection to either the
            // host (for http or https), the http proxy, or the http proxy
            // pre-CONNECTed to https server. In any case, we'll be ready
            // to send it requests.
            pconn, err := t.getConn(treq, cm)
            if err != nil {
                t.setReqCanceler(cancelKey, nil)
                req.closeBody()
                return nil, err
            }

            var resp *Response
            if pconn.alt != nil {
                // HTTP/2 path.
                t.setReqCanceler(cancelKey, nil) // not cancelable with CancelRequest
                resp, err = pconn.alt.RoundTrip(req)
            } else {
                resp, err = pconn.roundTrip(treq)
            }
            if err == nil {
                resp.Request = origReq
                return resp, nil
            }

            // Failed. Clean up and determine whether to retry.
            if http2isNoCachedConnError(err) {
                if t.removeIdleConn(pconn) {
                    t.decConnsPerHost(pconn.cacheKey)
                }
            } else if !pconn.shouldRetryRequest(req, err) {
                // Issue 16465: return underlying net.Conn.Read error from peek,
                // as we've historically done.
                if e, ok := err.(transportReadFromServerError); ok {
                    err = e.err
                }
                return nil, err
            }
            testHookRoundTripRetried()

            // Rewind the body if we're able to.
            req, err = rewindBody(req)
            if err != nil {
                return nil, err
            }
        }
    }
There are three key points:
First, at the top of each loop iteration, a new variable treq of type transportRequest, which embeds Request, is created.
Second, the getConn method is called. It implements the cached connection pool that supports persistent connections. Of course, if no cached connection is available, a new connection is created and added to the pool. I will explain this behavior in detail in the next section.
Third, pconn.roundTrip is called in the else branch of the pconn.alt check, which is the HTTP/1.1 path. The name pconn is self-explanatory: it is of type persistConn.
The transportRequest is passed as a parameter to the getConn method, which returns pconn. Then pconn.roundTrip is called to execute the HTTP request. With that, we have covered all the steps in the workflow diagram above.
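The connection pool behind getConn can be observed without reading any internals, using the net/http/httptrace package. This is a minimal sketch, assuming a local test server and sequential requests; the helper names are hypothetical. It records the Reused flag that the transport reports each time it obtains a connection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// newHelloServer starts a local test server that answers "hello".
func newHelloServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	}))
}

// reusedFlags sends n sequential GET requests through one client with a fresh
// Transport and records, for each request, whether the connection it was
// given came from the idle pool.
func reusedFlags(url string, n int) ([]bool, error) {
	client := &http.Client{Transport: &http.Transport{}}
	flags := make([]bool, 0, n)
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			flags = append(flags, info.Reused)
		},
	}
	for i := 0; i < n; i++ {
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Reading the body to the end and closing it lets the transport
		// put the connection back into its idle pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return flags, nil
}

func main() {
	srv := newHelloServer()
	defer srv.Close()
	flags, err := reusedFlags(srv.URL, 2)
	fmt.Println(flags, err)
}
```

The first request has to dial a new connection, while the second one should be served by the cached persistent connection, as long as the first response body was fully read and closed.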
Summary
In this first article of the series, we walked through the workflow of sending an HTTP request step by step. In the second article, I'll discuss how the HTTP message is sent to the TCP stack.