The Art of Handling Infinite Input in Go Web Handlers
Browsing /r/golang threads that post code, I keep seeing a few recurring bugs, and not just from newcomers.
One of them shows up in HTTP handlers. Can you spot it in this toy example (written purely for demonstration purposes)?
```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

type whateverJson map[string]any

func unmarshalRequest(req *http.Request, target any) error {
	body, err := io.ReadAll(req.Body)
	if err != nil {
		return fmt.Errorf("failed to read body: %w", err)
	}
	err = json.Unmarshal(body, target)
	if err != nil {
		return fmt.Errorf("failed to unmarshal json: %w, body was: %s", err, string(body))
	}
	return nil
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		t := whateverJson{}
		err := unmarshalRequest(req, &t)
		if err != nil {
			http.Error(w, err.Error(), 400)
			return
		}
		fmt.Fprintf(w, "got: %#v\n", t)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```
Knowledge of the HTTP protocol is crucial when writing HTTP handlers. The key to the bug is that request bodies can be arbitrarily large, so we must cap the amount of data we are willing to handle. The bug is that the code does not use `http.MaxBytesReader` to do so. An adversary could send us a multi-gigabyte request, and we would buffer it all in memory, possibly causing a denial-of-service issue. There are several other problems with this code as well, so let's go through them:
- Do we really need to buffer the request body in memory?
Buffering can help with debugging, and some requirements genuinely demand the full body in memory, but in every other case we should use the streaming decoder.
Since we are not really interested in the JSON text "as a whole", only in the data it encodes, decoding directly from the stream avoids holding the entire body at once.
- Decoding JSON into `map[string]any` is an antipattern and leads to ugly code. It rarely happens that the JSON data we want to support is so generic that a map is the only way to represent it.
Maybe we could partially decode the JSON first, to see which type we should unmarshal it into.
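One common way to do that partial decode is an envelope with a discriminating field and a `json.RawMessage` payload. The `envelope`, `createUser`, and `deleteUser` types and the field names below are hypothetical, but the technique itself is standard:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope decodes only the discriminating "type" field up front,
// deferring the rest of the payload via json.RawMessage.
type envelope struct {
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload"`
}

// Concrete, typed payloads instead of map[string]any.
type createUser struct {
	Name string `json:"name"`
}

type deleteUser struct {
	ID int `json:"id"`
}

// dispatch partially decodes the input, then unmarshals the payload
// into the concrete type indicated by the "type" field.
func dispatch(data []byte) (any, error) {
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, fmt.Errorf("failed to decode envelope: %w", err)
	}
	switch env.Type {
	case "create":
		var c createUser
		if err := json.Unmarshal(env.Payload, &c); err != nil {
			return nil, fmt.Errorf("failed to decode create payload: %w", err)
		}
		return c, nil
	case "delete":
		var d deleteUser
		if err := json.Unmarshal(env.Payload, &d); err != nil {
			return nil, fmt.Errorf("failed to decode delete payload: %w", err)
		}
		return d, nil
	default:
		return nil, fmt.Errorf("unknown type %q", env.Type)
	}
}

func main() {
	v, err := dispatch([]byte(`{"type":"create","payload":{"name":"alice"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("got: %#v\n", v)
}
```

Unknown types fail loudly instead of silently producing an untyped map.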
- We didn't customize the HTTP server's timeouts, and we probably should, depending on our use case. The defaults are:
  - no ReadTimeout, meaning reading the entire request (body included) can take arbitrarily long, so a client can dribble the body in forever
  - no ReadHeaderTimeout, meaning the client can be infinitely slow in sending us the HTTP headers
  - no WriteTimeout, so our handler can be infinitely slow in producing a response
  - no IdleTimeout, so we will keep the TCP connection open indefinitely when keep-alives are enabled
A marginally better version of the code above is:
```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

type whateverJson map[string]any

const requestBodyLimit = 2 * 1024 * 1024

func unmarshalRequest(r io.Reader, target any) error {
	dec := json.NewDecoder(r)
	if err := dec.Decode(target); err != nil {
		return fmt.Errorf("failed to unmarshal json: %w", err)
	}
	return nil
}

func main() {
	srv := http.Server{
		Addr:              ":8080",
		ReadHeaderTimeout: 30 * time.Second,
		IdleTimeout:       30 * time.Second,
	}
	http.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		t := whateverJson{}
		r := http.MaxBytesReader(w, req.Body, requestBodyLimit)
		if err := unmarshalRequest(r, &t); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		fmt.Fprintf(w, "got: %#v\n", t)
	})
	log.Fatal(srv.ListenAndServe())
}
```
Sadly, these defaults are now impossible to change, since that would break backward compatibility, which is a big no-no.
The compatibility promise is crucial to the health of the Go ecosystem, so the only thing we can do is be aware of these issues.
If you would like additional information on exposing your HTTP server on the internet, gopheracademy has some very useful material.
Link: https://blog.gopheracademy.com/advent-2016/exposing-go-on-the-internet/