Scraps of Java & Vavr

»Vavr Stream vs Iterator

vavr’s Stream is

  • traversable multiple times
  • evaluated lazily
  • cached after evaluation

Java’s and vavr’s Stream are quite different. To have the equivalent of Java’s Stream in vavr, you have to use Iterator.
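For comparison, here is a small sketch of the single-use behaviour of java.util.stream.Stream. It uses only the JDK, no vavr, and the class and method names are made up for illustration:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SingleUseStreamDemo {

  // Returns true if a second terminal operation on the same Java Stream is rejected.
  static boolean secondTraversalFails() {
    Stream<Integer> numbers = Stream.of(1, 2, 3);
    List<Integer> first = numbers.collect(Collectors.toList()); // first traversal works
    try {
      numbers.collect(Collectors.toList()); // the stream has already been consumed
      return false;
    } catch (IllegalStateException expected) {
      return first.size() == 3;
    }
  }
}
```

A vavr Stream can be traversed again at this point, which is why Iterator is the closer match for Java’s Stream.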

Source

»Class instance of generic type T

Due to type erasure, it is difficult to get a Class<T> instance for a type parameter T. A common solution is to pass the Class<T> object as a constructor argument.

public class Foo<T> {
    Class<T> clazz;
    public Foo(Class<T> clazz) {
        this.clazz = clazz;
    }
}

Instantiate with new Foo<>(String.class);
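
As a rough sketch of what the stored Class<T> is good for: creating instances and doing checked casts at runtime. TypedFactory and its methods are hypothetical names, and create() assumes that T has an accessible no-arg constructor.

```java
final class TypedFactory<T> {

  private final Class<T> clazz;

  TypedFactory(Class<T> clazz) {
    this.clazz = clazz;
  }

  // Creates a new T through its no-arg constructor (assumption: such a constructor exists).
  T create() {
    try {
      return clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("cannot instantiate " + clazz.getName(), e);
    }
  }

  // Runtime type check that would be impossible with T alone because of erasure.
  boolean accepts(Object candidate) {
    return clazz.isInstance(candidate);
  }

  // Checked cast; throws ClassCastException if the object is not a T.
  T cast(Object candidate) {
    return clazz.cast(candidate);
  }
}
```

For example, new TypedFactory<>(StringBuilder.class).create() gives you an empty StringBuilder.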

»Handle big, but not too big files from S3 buckets

So you have files of up to 1 GiB in your S3 storage and you need to read them on a regular basis? There are two options, depending on the size:

  1. If the file is small, read it into memory.
  2. Otherwise, download it to a temporary file, read it and delete it afterwards.

In particular, the »delete afterwards« part can be tricky in Java:

public InputStream getContentStream(String key) throws IOException {
  // fetch only the metadata first, so no content stream is left open for big objects
  long size = s3Client.getObjectMetadata(bucket, key).getContentLength();
  if (size > MAX_IN_MEMORY_SIZE) {
    return downloadAndOpen(key);
  } else {
    return s3Client.getObject(bucket, key).getObjectContent();
  }
}

private InputStream downloadAndOpen(String key) throws IOException {
  Path tempFile = Files.createTempFile(bucket + "-", null);
  try {
    s3Client.getObject(new GetObjectRequest(bucket, key), tempFile.toFile());
  } catch (Exception ex) {
    // clean up the temp file if the download fails
    Files.deleteIfExists(tempFile);
    throw ex;
  }

  // the temp file is removed once the returned stream is closed
  return Files.newInputStream(tempFile, StandardOpenOption.DELETE_ON_CLOSE);
}

Use getContentStream with try-with-resources, so that the stream is closed and the temporary file is deleted.
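
The DELETE_ON_CLOSE part can be tried without S3. The following is a minimal sketch with a made-up class name and a locally written temp file standing in for the download:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DeleteOnCloseDemo {

  // Writes a temp file, reads it through a DELETE_ON_CLOSE stream and
  // reports whether the content was readable and the file is gone afterwards.
  static boolean tempFileIsGoneAfterClose() {
    try {
      Path tempFile = Files.createTempFile("demo-", null);
      Files.write(tempFile, "hello".getBytes(StandardCharsets.UTF_8));

      String content;
      try (InputStream in = Files.newInputStream(tempFile, StandardOpenOption.DELETE_ON_CLOSE)) {
        content = new String(in.readAllBytes(), StandardCharsets.UTF_8);
      }
      // the stream is closed here, so the file should no longer exist
      return content.equals("hello") && Files.notExists(tempFile);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```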

»Character encoding of files from S3 buckets.

Don’t ask why, but recently I had to read obscure files in what I believe is COBOL copybook format. These files were encoded either in EBCDIC 037 (aka »CP037«) or in ASCII/UTF-8, depending on the system that wrote them. If you have read »The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)«, you might already suspect that reading the data as a UTF-8 String, checking which encoding it is and then re-encoding it can result in surprises. Usually this is a non-issue, because you can read a file from disk as an InputStream, wrap it in a PushbackInputStream and peek at the first bytes to determine the encoding.
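
The peek-then-decide approach for a plain InputStream could look roughly like this. It is a sketch under the same assumption as my files (the content starts with a digit); EncodingSniffer and readDetecting are hypothetical names, and the cp037 charset has to be available in the JDK you run on.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSniffer {

  static final Charset CP037 = Charset.forName("cp037");

  // Peeks at the first byte, pushes it back and decodes the whole stream accordingly.
  static String readDetecting(InputStream raw) {
    try (PushbackInputStream in = new PushbackInputStream(raw, 1)) {
      int first = in.read();
      if (first == -1) {
        return "";
      }
      in.unread(first); // put the byte back, so the decoder sees the complete content

      Charset charset;
      if (first >= '0' && first <= '9') {
        charset = StandardCharsets.UTF_8; // ASCII digits are 0x30..0x39
      } else if (first >= 0xF0 && first <= 0xF9) {
        charset = CP037; // EBCDIC 037 digits are 0xF0..0xF9
      } else {
        throw new IllegalArgumentException("unexpected first byte: " + first);
      }
      return new String(in.readAllBytes(), charset);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```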

But what if you get your data from an S3 bucket or similar? In this case I came to the conclusion: try to get the data as an InputStream, or get it as a byte[] and keep it that way until you convert. In my case the copybooks always start with a number, so I checked whether the first byte was a digit:

public static String decodeContent(byte[] data) {
  if (data == null) {
    return null;
  }
  if (data.length == 0) {
    return "";
  }

  // we expect a digit as first character
  String probe = new String(new byte[]{data[0]}, StandardCharsets.UTF_8);
  if (Character.isDigit(probe.charAt(0))) {
    return new String(data, StandardCharsets.UTF_8);
  }

  // so no digit, is it perhaps EBCDIC 037?
  probe = new String(new byte[]{data[0]}, Charset.forName("cp037"));
  if (Character.isDigit(probe.charAt(0))) {
    return new String(data, Charset.forName("cp037"));
  }

  throw new RuntimeException("something went wrong");
}

»Domain error template

In case you plan to use Validations, it can be handy to have a DomainError class that can serve as parent for more specific errors.

package org.bargsten.common;

import lombok.Getter;

@Getter
public class DomainError {

  protected String message;
  protected Throwable exception;

  public RuntimeException toException() {
    return new RuntimeException(message, exception);
  }
}
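
A more specific error could then look like the sketch below. To keep it compilable on its own, it repeats DomainError in plain Java without Lombok; FileNotReadableError and its message text are invented for illustration.

```java
// Plain-Java stand-in for the Lombok-based DomainError, so this sketch compiles on its own.
class DomainError {

  protected String message;
  protected Throwable exception;

  public String getMessage() {
    return message;
  }

  public Throwable getException() {
    return exception;
  }

  public RuntimeException toException() {
    return new RuntimeException(message, exception);
  }
}

// Hypothetical specific error: it only fills in the fields of the parent.
final class FileNotReadableError extends DomainError {

  FileNotReadableError(String key, Throwable cause) {
    this.message = "could not read file: " + key;
    this.exception = cause;
  }
}
```

At the edge of the application, toException() turns such an error back into something that can be thrown.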
»Read reference files from the resource folder.

To test file outputs, I tend to keep reference files with the expected results in the resource folder. To streamline things, I use these convenience functions (the path needs to start with / to target the right resource file).

package org.bargsten.testkit;

import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import org.apache.commons.io.IOUtils;

public final class TestFileUtil {

  private TestFileUtil() {
  }

  public static String getResourceContent(Object o, String path) throws IOException {
    try (InputStream in = open(o, path)) {
      return IOUtils.toString(in, StandardCharsets.UTF_8);
    }
  }

  public static byte[] getResourceBytes(Object o, String path) throws IOException {
    try (InputStream in = open(o, path)) {
      return IOUtils.toByteArray(in);
    }
  }

  public static String getGzippedResourceContent(Object o, String path) throws IOException {
    try (InputStream in = new GZIPInputStream(open(o, path))) {
      return IOUtils.toString(in, StandardCharsets.UTF_8);
    }
  }

  public static byte[] getGzippedResourceBytes(Object o, String path) throws IOException {
    try (InputStream in = new GZIPInputStream(open(o, path))) {
      return IOUtils.toByteArray(in);
    }
  }

  // fail with a clear message instead of a NullPointerException if the resource is missing
  private static InputStream open(Object o, String path) throws IOException {
    InputStream in = o.getClass().getResourceAsStream(path);
    if (in == null) {
      throw new IOException("resource not found: " + path);
    }
    return in;
  }
}

»Java Testing with Mockito & AssertJ

I always forget how to set these up. Usually that is no problem, because you have plenty of examples in your current project. But what if you change clients?

Dependencies

Documentation

JUnit 5 & Mockito

@ExtendWith(MockitoExtension.class)
public class MonkeyServiceTest {
  
  @Mock(answer = Answers.RETURNS_SMART_NULLS)
  ForestService forestService;

  @InjectMocks
  MonkeyService monkeyService;

  @Spy
  Config config = ConfigFactory.load();
  
  // capture args for classes with nested type parameters
  @Captor
  ArgumentCaptor<List<Monkey>> monkeyCaptor;

  @Test
  @SneakyThrows
  void shouldEatBanana() {

    // stub a void method and grab the contents of the temp file passed as second argument
    ArrayList<String> content = new ArrayList<>();
    doAnswer(i -> {
      Path f = i.getArgument(1, Path.class);
      content.add(TestFileUtil.getFileContent(f));
      return null;
    }).when(forestService).write(any(), any());

    when(forestService.setArea(any(AreaRequest.class), any())).then(invocation -> {
          // ...
          return null;
        }
    );
    final ArgumentCaptor<String> contentCaptor = ArgumentCaptor.forClass(String.class);
    final ArgumentCaptor<String> treeCaptor = ArgumentCaptor.forClass(String.class);
    monkeyService.eatAll();
    verify(forestService, times(1)).shakeTree(treeCaptor.capture(), contentCaptor.capture());
  }
  // ...
}

Assert exceptions

import static org.assertj.core.api.Assertions.assertThatThrownBy;

assertThatThrownBy(() -> monkeyService.getMonkey(id)).isInstanceOf(NoRainforestException.class);

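Mock static methods

Activate the required Mockito extension, the inline mock maker (it is the default as of Mockito 5, so this step is only needed for older versions):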

mkdir -p src/test/resources/mockito-extensions
echo mock-maker-inline > src/test/resources/mockito-extensions/org.mockito.plugins.MockMaker

In the test:

try (MockedStatic<TimeProvider> mockedTimeProvider = mockStatic(TimeProvider.class)) {
  mockedTimeProvider
    .when(TimeProvider::now)
    .thenReturn(LocalDateTime.of(2000, 1, 1, 1, 1, 1));
  
  //...
}

»Vavr, Stream.unfold and type inference

I came across Stream.unfold. It seems to be a good solution if you need to read paginated results from a service or database. You can pass the current page or result as parameter to the function that is called in the unfold expression.

However, Java runs into a type-constraint issue for functions that are not inline lambdas. You’ll get something like “Incompatible equality constraint” when you try to use a method reference in Stream.unfold, Stream.unfoldRight or Stream.unfoldLeft.

It took me some time to figure out that it boils down to (a valid) type constraint in the Java compiler. To actually get the desired result, the function signature needs to be adjusted. Let’s assume something like this:

FooBarRepository repo = new FooBarRepository();

List<String> result = Stream
        .unfoldRight(1, repo::generateChunk)
        .flatMap(List::toStream).take(50).toList();
result.forEach(e -> log.info("element: {}", e));

Instead of declaring the generateChunk function like this:

public Option<Tuple2<List<String>, Integer>> generateChunk(int counter) {
  //...
}

adapt the signature to:

public Option<Tuple2<? extends List<String>, ? extends Integer>> generateChunk(int counter) {
  //...
}
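
The same effect can be reproduced without vavr. The sketch below uses a hypothetical stand-in for unfoldRight, with Optional and Map.Entry in place of Option and Tuple2, that has the same wildcard shape in its parameter. As far as I understand it, generics are invariant, so a return type of Optional<Map.Entry<String, Integer>> does not match Optional<Map.Entry<? extends U, ? extends T>>; the method reference only fits if the wildcards are repeated in the return type.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class UnfoldInference {

  // Hypothetical stand-in: same wildcard shape as the function parameter of vavr's unfoldRight.
  static <T, U> List<U> unfoldRight(
      T seed, Function<? super T, Optional<Map.Entry<? extends U, ? extends T>>> f) {
    List<U> out = new ArrayList<>();
    Optional<Map.Entry<? extends U, ? extends T>> next = f.apply(seed);
    while (next.isPresent()) {
      out.add(next.get().getKey());           // element to emit
      next = f.apply(next.get().getValue());  // seed for the next round
    }
    return out;
  }

  // The return type repeats the wildcards, so the method reference below is accepted.
  static Optional<Map.Entry<? extends String, ? extends Integer>> chunk(int page) {
    if (page > 3) {
      return Optional.empty(); // no more pages
    }
    Map.Entry<? extends String, ? extends Integer> entry =
        new SimpleImmutableEntry<>("page-" + page, page + 1);
    return Optional.of(entry);
  }

  static List<String> pages() {
    return unfoldRight(1, UnfoldInference::chunk);
  }
}
```

Changing the return type of chunk to Optional<Map.Entry<String, Integer>> should bring back the compile error from above.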
»Dev tools for Java development
»Short circuit vavr Stream

If you are in a Stream, have to deal with Exceptions and still want to see which elements were processed successfully, short circuiting makes sense.

Stream<S3ObjectSummary> files = s3Service.listFiles();

files
  .map(S3ObjectSummary::getKey)
  .map(this::readFile)
  //short circuit stream if we have an error during reading
  .takeUntil(Try::isFailure).flatMap(t -> t)
  .map(this::handleFile)
  //short circuit stream if we have an error during handling
  .takeUntil(Try::isFailure).flatMap(t -> t)
  .map(this::cleanFile)
  //short circuit stream if we have an error during cleaning
  .takeUntil(Try::isFailure).flatMap(t -> t);

»Wrap exceptions in vavr Try

In current vavr (0.10.3), I really miss a generic mapFailure that I can use to wrap an earlier caught exception in a new exception. I came up with this helper function:

import io.vavr.API;
import java.util.Objects;
import java.util.function.Function;

public class TryUtil {
  public static <T, R> API.Match.Case<T, R> wrap(Function<? super T, ? extends R> f) {
    Objects.requireNonNull(f, "f is null");

    return API.Case(API.$(), f);
  }
}

It can be used like this:

return Try.of(() -> s3Service.getObjectAsString(key))
        .mapFailure(wrap(ex -> new ProcessingException(key, ex)));

»Useful dependencies

mvn-repository

»Deprecate methods in Java

Methods (or other elements) can be marked as deprecated with the @Deprecated annotation in Java. To make the work of your colleagues (or library users) easier, you can even go a step further and add Javadoc documentation so that IDEs, such as IntelliJ, can suggest replacement methods:

public class MathUtil {

    /**
     * @deprecated
     * use {@link MathUtil#newAndBetterSum(int, int)}
     */
    @Deprecated
    public static int sum(int a, int b) {
        return a + b;
    }

    public static int newAndBetterSum(int a, int b) {
        return a + b;
    }
}
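
Since Java 9, @Deprecated also accepts the since and forRemoval elements, which lets you state when the deprecation started and that the method is going to disappear. Here is a small sketch; MathUtilV2 and the version string "2.0" are made up for illustration:

```java
final class MathUtilV2 {

  private MathUtilV2() {
  }

  /**
   * @deprecated use {@link #newAndBetterSum(int, int)} instead
   */
  @Deprecated(since = "2.0", forRemoval = true)
  public static int sum(int a, int b) {
    // delegate, so both methods keep behaving the same until sum is removed
    return newAndBetterSum(a, b);
  }

  public static int newAndBetterSum(int a, int b) {
    return a + b;
  }
}
```

With forRemoval = true, the compiler reports callers with a removal warning instead of the ordinary deprecation warning.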