Limit the maximum body size for OkHttp responses


In the Java/Kotlin/Scala world, OkHttp is a common HTTP client. There is even an OkHttp backend for sttp (the Scala HTTP client you always wanted!). I personally advocate async HTTP clients, but in a lot of cases a sync client is sufficient or even required. Combine this with the requirement of downloading untrusted content from the internet: it could be a crawler for your favourite machine-learning project, or a service that needs to download images supplied by clients.

Downloading stuff from »the internet« can be dangerous. Certain Denial-of-Service attacks use the body size or content as an attack vector. In an async context it is quite simple to stream the response body and cap the size at a specified maximum. However, I needed to do it with the sync version of OkHttp, and things got more complicated than I expected. akka-http has a withSizeLimit directive, but OkHttp lacks such functionality. Instead, it provides a more generic mechanism: the possibility to »intercept« responses.

How does it work? A simple approach

(The code is in Scala, but it is straightforward to port to other JVM-based languages.)

Let’s get the boilerplate out of the way:

// src/main/scala/HttpClient.scala
import okhttp3._
import okio._

import java.io.IOException

class StreamSizeException(message: String, cause: Option[Throwable] = None)
    extends IOException(message, cause.orNull)

We'll use StreamSizeException to blow things up when the maximum body size is exceeded.

A straightforward implementation with interceptors then looks like this:

// src/main/scala/HttpClient.scala
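def buildSimpleSizeBoundedClient(maxBytes: Long): OkHttpClient = {
  new OkHttpClient.Builder()
    .addInterceptor((chain: Interceptor.Chain) => {
      val res = chain.proceed(chain.request())
      try {
        // Reject early if the declared Content-Length is already too large
        if (res.body().contentLength() > maxBytes)
          throw new StreamSizeException(s"content length exceeds maximum ($maxBytes bytes)")

        // Copy at most maxBytes + 1 bytes; the extra byte proves the limit was exceeded
        val body = res.peekBody(maxBytes + 1)
        if (body.contentLength() > maxBytes)
          throw new StreamSizeException(s"response body too big (>$maxBytes bytes)")
        res.newBuilder().body(body).build()
      } finally {
        // The peeked body is an in-memory copy, so the original body can always be
        // released here; otherwise the connection would leak.
        res.close()
      }
    })
    .build()
}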
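  1. We check the declared Content-Length (OkHttp reports -1 if it is unknown, so this check does not trigger for such responses).
  2. We check whether the body exceeds the size limit by »peeking« at the body of the response.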

That works OK, but it has two disadvantages, which become clear in the test:

// src/test/scala/HttpClientSpec.scala
it should "make KABOOOM for our simple solution" in {
  val length = 1001
  val body = Array.fill[Byte](length)(1)

  stubFor(get(urlEqualTo("/abc")).willReturn(ok().withBody(body)))

  val req = new Request.Builder()
    .url(s"http://$Host:$port/abc")
    .get()
    .build()
  val client = buildSimpleSizeBoundedClient(1000)
  val call = client.newCall(req)

  // here it throws at the execute stage, even though we might not have the
  // intention to consume the body
  an[StreamSizeException] should be thrownBy call.execute()
}
  1. An exception is thrown as soon as we execute the call. That makes it difficult to inspect the response, or to proceed in situations where we want to ignore the body.
  2. We need to buffer part of the body (up to maxBytes + 1 bytes) in memory just to figure out whether it violates the limit.

Getting it right

We can prevent the disadvantages by throwing more code at the problem. As always.

// src/main/scala/HttpClient.scala
def buildSizeBoundedClient(maxBytes: Long): OkHttpClient = {
  new OkHttpClient.Builder()
    .addInterceptor((chain: Interceptor.Chain) => {
      val res = chain.proceed(chain.request())

      // here is where the magic happens
      val body = new BoundedResponseBody(res.body(), maxBytes)

      res.newBuilder().body(body).build()
    })
    .build()
}

To get this working, I introduced BoundedResponseBody and BoundedSource to wrap the response body. BoundedResponseBody is a thin shell around BoundedSource that lets us plug it into OkHttp.

// src/main/scala/HttpClient.scala
class BoundedResponseBody(body: ResponseBody, maxBytes: Long) extends ResponseBody {
  // Buffer the bounded source once, so that repeated source() calls share the same
  // stream instead of each wrapping it in a fresh buffer.
  private val boundedSource: BufferedSource =
    Okio.buffer(new BoundedSource(body.source(), body.contentLength(), maxBytes))

  override def contentType(): MediaType = body.contentType()

  override def contentLength(): Long = body.contentLength()

  override def source(): BufferedSource = boundedSource
}

The magic happens in BoundedSource: here we check the Content-Length and also count the bytes of the body stream while it is consumed.

// src/main/scala/HttpClient.scala
class BoundedSource(source: Source, contentLength: Long, maxBytes: Long)
    extends ForwardingSource(source) {
  private var bytesReadSoFar = 0L
  override def read(sink: Buffer, byteCount: Long): Long = {
    // We do the contentLength check late, because we only want to throw when the body is consumed,
    // not when it is constructed. Perhaps the caller does not even want to access the body.
    if (contentLength > maxBytes)
      throw new StreamSizeException(s"content length exceeds maximum ($maxBytes bytes)")

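    // Make sure we don't read more into the buffer than necessary. Once the limit is
    // reached we still ask for one byte, which tells us whether the body goes on.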
    val effectiveByteCount = Math.min(byteCount, Math.max(1, maxBytes - bytesReadSoFar))

    val bytesRead = super.read(sink, effectiveByteCount)
    if (bytesRead > 0)
      bytesReadSoFar += bytesRead

    if (bytesReadSoFar > maxBytes)
      throw new StreamSizeException(s"maximum body size exceeded ($maxBytes bytes)")

    bytesRead
  }
}
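To play with the boundary behaviour without an HTTP server or Okio on the classpath, the same counting idea can be sketched on top of plain java.io. This is only an analogue of BoundedSource, not part of OkHttp: BoundedInputStream, BoundedInputStreamDemo and drain are hypothetical names made up for this sketch, and it throws a plain IOException instead of StreamSizeException to stay self-contained.

```scala
import java.io.{ByteArrayInputStream, FilterInputStream, IOException, InputStream}
import scala.util.Try

// Hypothetical java.io counterpart of BoundedSource (not an OkHttp class)
class BoundedInputStream(in: InputStream, maxBytes: Long) extends FilterInputStream(in) {
  private var bytesReadSoFar = 0L

  override def read(): Int = {
    val b = super.read()
    if (b >= 0) count(1)
    b
  }

  override def read(buf: Array[Byte], off: Int, len: Int): Int = {
    // Same idea as effectiveByteCount: never ask for more than one byte past the limit
    val remaining = math.max(1L, maxBytes - bytesReadSoFar)
    val n = super.read(buf, off, math.min(len.toLong, remaining).toInt)
    if (n > 0) count(n)
    n
  }

  // Track consumed bytes and fail as soon as the limit is crossed
  private def count(n: Int): Unit = {
    bytesReadSoFar += n
    if (bytesReadSoFar > maxBytes)
      throw new IOException(s"maximum body size exceeded ($maxBytes bytes)")
  }
}

object BoundedInputStreamDemo {
  // Reads the stream to the end and returns the number of bytes seen
  def drain(in: InputStream): Long = {
    val buf = new Array[Byte](256)
    var total = 0L
    var n = in.read(buf)
    while (n != -1) {
      total += n
      n = in.read(buf)
    }
    total
  }

  def main(args: Array[String]): Unit = {
    val exactlyAtLimit = new BoundedInputStream(new ByteArrayInputStream(new Array[Byte](1000)), 1000)
    val oneByteTooMany = new BoundedInputStream(new ByteArrayInputStream(new Array[Byte](1001)), 1000)

    println(drain(exactlyAtLimit))                 // a body of exactly maxBytes passes
    println(Try(drain(oneByteTooMany)).isFailure)  // one byte more trips the limit
  }
}
```

The point of the sketch is the off-by-one handling: a body of exactly maxBytes must be readable to the end, which is why the wrapper keeps asking for a single byte once the limit is reached instead of failing right away.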

In the test you can see that the exception shows up in the right spot, namely when we want to consume the body:

// src/test/scala/HttpClientSpec.scala
it should "make KABOOOM" in {
  val length = 1001
  val body = Array.fill[Byte](length)(1)

  stubFor(get(urlEqualTo("/abc")).willReturn(ok().withBody(body)))

  val req = new Request.Builder()
    .url(s"http://$Host:$port/abc")
    .get()
    .build()
  val client = buildSizeBoundedClient(1000)
  // here we can execute the call
  val res = client.newCall(req).execute()
  // and it breaks only when we want to consume the body
  an[StreamSizeException] should be thrownBy res.body().bytes()
}
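Outside of a test, calling code might look roughly like the following sketch. BoundedFetch and fetch are hypothetical names, and the client passed in is assumed to have been built with buildSizeBoundedClient. Because StreamSizeException extends IOException, a plain IOException handler is enough to catch the size violation along with ordinary network errors.

```scala
import okhttp3.{OkHttpClient, Request}
import java.io.IOException

object BoundedFetch {
  // Hypothetical helper: returns the body bytes, or an error message if the
  // download failed or (with a size-bounded client) the body was too large.
  def fetch(client: OkHttpClient, url: String): Either[String, Array[Byte]] = {
    val req = new Request.Builder().url(url).get().build()
    try {
      val res = client.newCall(req).execute()
      try {
        if (!res.isSuccessful())
          Left(s"unexpected status ${res.code()}") // body is never touched here
        else
          Right(res.body().bytes()) // a size-bounded client throws while reading
      } finally {
        res.close() // always release the connection, even after an exception
      }
    } catch {
      case e: IOException => Left(s"download failed: ${e.getMessage}")
    }
  }
}
```

Since the limit is only enforced while the body is read, the status check above works for oversized responses too; the connection is released in the finally block either way.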

I hope this helps.

Have fun!