Rational Java: java8


Tuesday, 23 June 2015

Java8: Generate a Random String in One Line

This program demonstrates the power of Java 8 streams by generating random strings. It chains filter, map, limit and collect to turn a stream of random numbers into the required string.

The program will output random strings containing numbers [0-9] and letters [a-z,A-Z].

The numbers in the code map to Unicode characters (for a full Unicode chart see here).

An explanation of the code is as follows:

Generate random numbers within the range 48 (Unicode for 0) to 122 (Unicode for z).
Only allow numbers from 48 to 57 (the digits 0-9), from 65 to 90 (the letters A-Z) or from 97 to 122 (the letters a-z).
Map each number to a char.
Stop when you have the required length of the string.
Collect the chars produced into a StringBuilder.
Turn the StringBuilder into a String and return it.
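The listing for this post is not reproduced on this page, so here is a minimal sketch of the six steps above rather than the original code. The class name RandomStrings and the method name generateRandomString are assumptions made for illustration. The upper bound passed to Random.ints is 123 because that bound is exclusive.

```java
import java.util.Random;

public class RandomStrings {

    // Hypothetical helper: builds a random alphanumeric string of the given length.
    static String generateRandomString(Random random, int length) {
        return random.ints(48, 123)                          // 48 = '0', 122 = 'z' (upper bound is exclusive)
                .filter(i -> i <= 57                         // digits 0-9
                        || (i >= 65 && i <= 90)              // letters A-Z
                        || i >= 97)                          // letters a-z
                .mapToObj(i -> (char) i)                     // map each number to a char
                .limit(length)                               // stop at the required length
                .collect(StringBuilder::new,                 // collect the chars into a StringBuilder
                        StringBuilder::append,
                        StringBuilder::append)
                .toString();                                 // turn the StringBuilder into a String
    }

    public static void main(String[] args) {
        System.out.println(generateRandomString(new Random(), 16));
    }
}
```

The limit comes after the filter, so the stream keeps drawing numbers until enough of them have passed the filter.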

Thursday, 30 April 2015

Cheating with Exceptions - Java 8 Lambdas

Leaving aside the religious debate about checked vs runtime exceptions, there are times when, due to poorly constructed libraries, dealing with checked exceptions can drive you insane.

Consider this snippet of code which you might want to write:

public void createTempFileForKey(String key) {
  Map<String, File> tempFiles = new ConcurrentHashMap<>();
  //does not compile because it throws an IOException!!
  tempFiles.computeIfAbsent(key, k -> File.createTempFile(key, ".tmp"));
}

For it to compile you need to catch the exception which leaves you with this code:

public void createTempFileForKey(String key) {
    Map<String, File> tempFiles = new ConcurrentHashMap<>();
    tempFiles.computeIfAbsent(key, k -> {
        try {
            return File.createTempFile(key, ".tmp");
        }catch(IOException e) {
            e.printStackTrace();
            return null;
        }
    });
}

Although this compiles, the IOException has effectively been swallowed. The user of this method should be informed that an Exception has been thrown.

To address this you could wrap the IOException in a generic RuntimeException as below:

public void createTempFileForKey(String key) throws RuntimeException {
    Map<String, File> tempFiles = new ConcurrentHashMap<>();
    tempFiles.computeIfAbsent(key, k -> {
        try {
            return File.createTempFile(key, ".tmp");
        }catch(IOException e) {
            throw new RuntimeException(e);
        }
    });
}

This code does throw an exception, but not the actual IOException which the code was intended to throw. It's possible that those in favour of RuntimeExceptions only would be happy with this code, especially if the solution could be refined to create a customised IORuntimeException. Nevertheless, the way most people code, they would expect their method to be able to throw the checked IOException from the File.createTempFile method.

The natural way to do this is a little convoluted and looks like this:

public void createTempFileForKey(String key) throws IOException {
    Map<String, File> tempFiles = new ConcurrentHashMap<>();
    try {
        tempFiles.computeIfAbsent(key, k -> {
            try {
                return File.createTempFile(key, ".tmp");
            } catch (IOException e) {
                // wrap the checked exception so it can leave the lambda
                throw new RuntimeException(e);
            }
        });
    } catch (RuntimeException e) {
        if (e.getCause() instanceof IOException) {
            // unwrap and rethrow the original checked exception
            throw (IOException) e.getCause();
        }
        // any other runtime exception must not be swallowed
        throw e;
    }
}

From inside the lambda, you would have to catch the IOException, wrap it in a RuntimeException and throw that RuntimeException. The enclosing method would then have to catch the RuntimeException, unwrap it and rethrow the IOException. All very ugly indeed!

In an ideal world, what we need is to be able to throw the checked exception from within the lambda without having to change the declaration of computeIfAbsent. In other words, to throw a checked exception as if it were a runtime exception. But unfortunately Java doesn't let us do that...

That is not unless we cheat! Here are two methods for doing precisely what we want, throwing a checked exception as if it were a runtime exception.

Method 1 - Using generics:

    public static void main(String[] args){
        doThrow(new IOException());
    }

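    static void doThrow(Exception e) {
        // CheckedException is the class that encloses these methods
        CheckedException.<RuntimeException> doThrow0(e);
    }

    @SuppressWarnings("unchecked")
    static <E extends Exception>
      void doThrow0(Exception e) throws E {
          // E is erased at runtime, so this cast never fails
          throw (E) e;
    }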

Note that we have created and thrown an IOException without it being declared in the main method.

Method 2 - Using Unsafe:

    public static void main(String[] args){
        getUnsafe().throwException(new IOException());
    }

    private static Unsafe getUnsafe(){
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }

Again we have managed to throw an IOException without having declared it in the method.

Whichever method you prefer we are now free to write the original code in this way:

public void createTempFileForKey(String key) throws IOException {
    Map<String, File> tempFiles = new ConcurrentHashMap<>();

    tempFiles.computeIfAbsent(key, k -> {
        try {
            return File.createTempFile(key, ".tmp");
        } catch (IOException e) {
            // "throw" tells the compiler that this branch never completes normally
            throw doThrow(e);
        }
    });
}

private RuntimeException doThrow(Exception e) {
    // throws e itself, checked or not
    getUnsafe().throwException(e);
    // never reached; only here to satisfy the return type
    return new RuntimeException();
}

private static Unsafe getUnsafe() {
    try {
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        return (Unsafe) theUnsafe.get(null);
    } catch (Exception e) {
        throw new AssertionError(e);
    }
}

The doThrow() method would obviously be encapsulated in some utility class leaving your code in createTempFileForKey() pretty clean.
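As a rough idea of what such a utility class might look like, here is a sketch built on the generics trick from Method 1 rather than on Unsafe. The class name Sneaky and the method names sneakyThrow and throwAs are hypothetical and chosen only for this illustration.

```java
import java.io.IOException;

public final class Sneaky {

    private Sneaky() {
        // utility class, not meant to be instantiated
    }

    // Throws t as-is, even if it is a checked exception. The RuntimeException
    // return type is never actually returned; it only lets callers write
    // "throw Sneaky.sneakyThrow(e);" so the compiler sees the branch end.
    public static RuntimeException sneakyThrow(Throwable t) {
        return Sneaky.<RuntimeException>throwAs(t);
    }

    @SuppressWarnings("unchecked")
    private static <E extends Throwable> RuntimeException throwAs(Throwable t) throws E {
        // E is erased at runtime, so the cast cannot fail
        throw (E) t;
    }

    public static void main(String[] args) {
        try {
            throw sneakyThrow(new IOException("not declared anywhere"));
        } catch (Exception e) {
            // the exception that arrives is the original IOException, not a wrapper
            System.out.println(e.getClass().getName());
        }
    }
}
```

Inside the lambda in createTempFileForKey() the catch block would then become a single line: throw Sneaky.sneakyThrow(e);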

Tuesday, 7 April 2015

Java 8 Lambdas in One Line

If you understand this line, or better still can write this code you can pretty much say that you have understood the essence of Java 8 Lambdas. Certainly in as much as they can be used with collections.

I found this in a recent presentation by Peter Lawrey. (Definitely worth watching the whole presentation when you have a spare hour.)

Anyway the task was to find the 20 most frequent words in a file:

As you can see, with Java 8 this can actually be done in a single (albeit rather long) line.

If you're not used to lambdas the code might seem a little scary, but it's actually pretty declarative and, once you get past the unfamiliar syntax, the logic reads fairly easily.
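The line from the presentation is not reproduced on this page. The following is a sketch of how such a pipeline could be written and is not the code from the talk. The class name TopWords, the method name mostFrequent and the choice to split on non-word characters and lower-case the text are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TopWords {

    // Hypothetical helper: returns the n most frequent words in the given lines.
    static List<String> mostFrequent(Stream<String> lines, int n) {
        return lines
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+"))) // lines -> words
                .filter(word -> !word.isEmpty())                                   // drop empty tokens
                .collect(Collectors.groupingBy(Function.identity(),                // word -> count
                        Collectors.counting()))
                .entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())     // most frequent first
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(mostFrequent(
                Stream.of("the cat and the dog", "the dog barked"), 2));
    }
}
```

To run it against a file, the lines could come from Files.lines(path) opened in a try-with-resources block and passed in with n set to 20.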

Wednesday, 18 February 2015

Java 8 pitfall - Beware of Files.lines()

There's a really nice new feature in Java8 which allows you to get a stream of Strings from a file in a one liner.

List<String> lines = Files.lines(path).collect(Collectors.toList());

You can manipulate the Stream as you would with any other Stream for example you might want to filter() or map() or limit() or skip() etc.

I started using this all over my code until I was hit with this exception:

Caused by: java.nio.file.FileSystemException: /tmp/date.txt: Too many open files in system
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.newByteChannel(Files.java:407)
at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
at java.nio.file.Files.newInputStream(Files.java:152)
at java.nio.file.Files.newBufferedReader(Files.java:2784)
at java.nio.file.Files.lines(Files.java:3744)
at java.nio.file.Files.lines(Files.java:3785)

For some reason I had too many open files! Odd, doesn't Files.lines() close the file?

See code below (run3()) where I've reproduced the issue:

My code looked something like run3(), which produced the exception. I proved this by running the Unix command lsof (which lists open files) and noticing many, many instances of date.txt open. To check that the problem was indeed with Files.lines(), I made sure that the code ran with run1() using a BufferedReader, which it did. By reading through the source code for Files I realised that the Stream needs to be created in a try-with-resources block (a Stream is AutoCloseable). When I implemented that in run2() the code ran fine again.
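The listing with run1(), run2() and run3() is not reproduced on this page. As a rough sketch of the fix that run2() is described as making, the stream can be opened in a try-with-resources block so that the underlying file is closed when the block exits. The class name FilesLinesDemo and the method name readAllLines are assumptions for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilesLinesDemo {

    // Hypothetical helper: reads every line of the file and closes it afterwards.
    static List<String> readAllLines(Path path) throws IOException {
        // Stream implements AutoCloseable; closing it closes the file handle
        try (Stream<String> lines = Files.lines(path)) {
            return lines.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, java.util.Arrays.asList("first", "second"));
        System.out.println(readAllLines(tmp));
        Files.delete(tmp);
    }
}
```

Any filter(), map() or limit() calls belong inside the try block, because the stream reads the file lazily.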

In my opinion this is not particularly intuitive. It really spoils the one-liner when you have to use try-with-resources. I guess that the code does need a signal as to when to close the file, but somehow it would be nice if that was hidden from us. At the very least it should be highlighted in the JavaDoc, which it is not :-)

Friday, 30 January 2015

Java8 Multi-threading ForkJoinPool: Dealing with exceptions

One of the main motivations behind the introduction of Java8 lambdas was to make it as easy as possible to use multiple cores (see Mastering Lambdas: Java Programming in a Multicore World). By simply changing your code from collection.stream()... to collection.parallelStream()... you have instant multi-threading at your disposal, which brings with it all the CPU power on your machine. (Let's ignore contention at this point.)

If you print out the names of the threads used by parallelStream you will notice that they are the same threads used by the ForkJoin framework and look something like this:

[ForkJoinPool.commonPool-worker-1]
[ForkJoinPool.commonPool-worker-2]

See Benjamin Winterberg's blog for a nicely worked example of this.

Now in Java 8 you can use this commonPool directly with the new static method ForkJoinPool.commonPool(). This returns an instance of ForkJoinPool (which is an ExecutorService) with the commonPool of threads, the same ones that are used in parallelStream. This means that any work you do directly with the commonPool will play very nicely with work done in parallelStream, especially the thread scheduling and work stealing between threads.

Let's work through an example of how you use ForkJoin especially in dealing with the tricky subject of exceptions.

Firstly, obtain an instance of the commonPool by calling ForkJoinPool.commonPool(). You can submit tasks to it using the submit() method. Because we are using Java8 we can pass in lambda expressions, which is really neat. As with all ExecutorService implementations, you can pass either instances of Runnable or Callable into submit(). When you pass a lambda into the submit method, the compiler will automatically turn it into a Runnable or a Callable by inspecting the shape of the lambda.

This leads to an interesting problem which highlights how lambdas work. Suppose that you have a method which has a return type of void (like a Runnable) but throws a checked exception (like a Callable). See the method throwException() in the code listing below for such an example. If you write this code it won't compile.

Future task1 = commonPool.submit(() -> {
            throwException("task 1");
        });

The reason for this is that the compiler assumes, because of the void return type, that you are trying to create a Runnable. Of course a Runnable can't throw a checked exception. To get around this problem you need to force the compiler to understand that you are creating a Callable, which is allowed to throw an Exception, using this code trick.

Future task1 = commonPool.submit(() -> {
            throwException("task 1");
            return null;
        });

This is a bit messy but does the job. Arguably, the compiler could have worked this out itself.

There are two more things to highlight in the full code listing below. One, you can see how many threads are going to be available in the pool using commonPool.getParallelism(). This can be adjusted with the parameter '-Djava.util.concurrent.ForkJoinPool.common.parallelism'. Two, notice how you can unwrap the ExecutionException so that your code can just present an IOException to its callers rather than a non-specific ExecutionException. Also note that this code fails on the first exception. If you want to collect all the exceptions you would have to structure the code appropriately, possibly returning a List of Exceptions, or maybe more neatly throwing a custom exception containing a list of underlying exceptions.

public class ForkJoinTest {
    public void run() throws IOException{
        ForkJoinPool commonPool = ForkJoinPool.commonPool();

        Future task1 = commonPool.submit(() -> {
            throwException("task 1");
            return null;
        });
        Future task2 = commonPool.submit(() -> {
            throwException("task 2");
            return null;
        });

        System.out.println("Do something while tasks being " +
                "executed on " + commonPool.getParallelism()
                + " threads");

        try {
            //wait on the result from task2
            task2.get();
            //wait on the result from task1
            task1.get();
        } catch (InterruptedException e) {
            throw new AssertionError(e);
        } catch (ExecutionException e) {
            Throwable innerException = e.getCause();
            if (innerException instanceof RuntimeException) {
                innerException = innerException.getCause();
                if(innerException instanceof IOException){
                    throw (IOException) innerException;
                }
            }
            throw new AssertionError(e);
        }
    }

    public void throwException(String message) throws IOException,
            InterruptedException {
        Thread.sleep(100);
        System.out.println(Thread.currentThread()
            + " throwing IOException");
        throw new IOException("Throw exception for " + message);
    }

    public static void main(String[] args) throws IOException{
        new ForkJoinTest().run();
    }
}
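The listing above stops at the first failure. As a sketch of the alternative mentioned before it, returning a List of exceptions, something along these lines could work. The class name CollectAllFailures and the methods runAll and rootCause are hypothetical. Walking the cause chain to its end is a simplification: it would also strip a cause attached to the IOException itself.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class CollectAllFailures {

    // Hypothetical helper: runs every task on the common pool, waits for all
    // of them and returns the underlying exception of each task that failed.
    static List<Throwable> runAll(List<Callable<Void>> tasks) throws InterruptedException {
        ForkJoinPool commonPool = ForkJoinPool.commonPool();
        List<Future<Void>> futures = new ArrayList<>();
        for (Callable<Void> task : tasks) {
            futures.add(commonPool.submit(task));
        }
        List<Throwable> failures = new ArrayList<>();
        for (Future<Void> future : futures) {
            try {
                future.get();
            } catch (ExecutionException e) {
                // keep going so that later failures are recorded too
                failures.add(rootCause(e));
            }
        }
        return failures;
    }

    // Follows getCause() to the end of the chain, past any wrapper exceptions.
    static Throwable rootCause(Throwable t) {
        while (t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<Void>> tasks = Arrays.asList(
                () -> { throw new IOException("task 1"); },
                () -> null,
                () -> { throw new IOException("task 2"); });
        runAll(tasks).forEach(System.out::println);
    }
}
```

A caller that prefers a single exception could attach the collected failures to one IOException with addSuppressed() and throw that.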

Thursday, 22 January 2015

Book review: Mastering Lambdas: Java Programming in a Multicore World (Maurice Naftalin)

Java 8 with its introduction of lambda expressions is the biggest change to the Java language since it was created some 20 years ago. No escaping from it - if you want to stay relevant as a Java programmer you will need to know about lambda expressions and it is certainly worth the effort to really get to grips with the new semantics and paradigms that Java 8 delivers.

Naftalin's book is actually the 3rd book I've read on the subject. The first 2, whilst reasonably well written, didn't really deliver much more than I could have learnt by reading the numerous tutorials out there on the web. Any Java 8 tutorial will explain how to use lambdas in your code and if that's all you want then you don't really need to buy a book at all. When I buy a book I expect a lot more than that. I expect a beginning, a middle and an end. And on this expectation Naftalin delivers beautifully. He explains the rationale and motivations of lambda expressions. Are they just a sprinkling of syntactic sugar? No, he explains, they are much more than that. He sets the scene, as is hinted at in the title of his book, 'a multicore world', and explains how Java needs to adapt to continue to hold its own so that it will still be one of the most important languages for the next 20 years.

I've been using Java 8 for the last 6 months and am fairly well acquainted with the syntax, but when I read his description of the syntax I could only wish that I had had this book 6 months ago! Naftalin has clearly put a huge amount of effort and consideration into how he delivers his ideas, which is why, perhaps, it wasn't one of the first books on the market to greet Java 8 as it appeared in the early part of last year.

Some books go off on a tangent trying to introduce their readers to functional programming through lambdas. I think that is a mistake. I've been a Java programmer since the language emerged and have no wish whatsoever to change my programming style into a functional style. If I wanted to do that I would use a functional language like Erlang. I want to continue using Java but to introduce some new concepts that can be delivered through lambdas. The declarative nature of lambdas is extremely nice, but we don't have to throw out the proverbial baby with the bath water and ditch everything that is good about OOP. I believe that Naftalin shares this opinion, as is evident from the way in which he introduces us to practical uses of lambdas in our code.

In conclusion, I would highly recommend this book to any Java developer who wants to get a real understanding of lambdas. The book is extremely readable and not particularly long (all the best books are short) but will tell you everything you will ever need to know about the subject. If you digest all the information in this short book you will be a far better programmer than someone who has just read a few tutorials on the web!

Java8 Lambdas: Sorting Performance Pitfall EXPLAINED

Written in collaboration with Peter Lawrey.

A few days ago I raised a serious problem with the performance of sorting using the new Java8 declarative style. See blogpost here. In that post I only pointed out the issue but in this post I'm going to go a bit deeper into understanding and explaining the causes of the problem. This will be done by reproducing the issue using the declarative style, and bit by bit modifying the code until we have removed the performance issue and are left with the performance that we would expect using the old style compare.

To recap, we are sorting instances of this class:

private static class MyComparableInt{
        private int a,b,c,d;

        public MyComparableInt(int i) {
            a = i%2;
            b = i%10;
            c = i%1000;
            d = i;
        }

        public int getA() { return a; }
        public int getB() { return b; }
        public int getC() { return c; }
        public int getD() { return d; }
}

Using the declarative Java 8 style (below) it took ~6s to sort 10m instances:

List<MyComparableInt> mySortedList = myComparableList.stream()
      .sorted(Comparator.comparing(MyComparableInt::getA)
                              .thenComparing(MyComparableInt::getB)
                              .thenComparing(MyComparableInt::getC)
                              .thenComparing(MyComparableInt::getD))
      .collect(Collectors.toList());

Using a custom sorter (below) took ~1.6s to sort 10m instances.
This is the code call to sort:

List<MyComparableInt> mySortedList = myComparableList.stream()
                    .sorted(MyComparableIntSorter.INSTANCE)
                    .collect(Collectors.toList());

Using this custom Comparator:

public enum MyComparableIntSorter 
    implements Comparator<MyComparableInt>{
        INSTANCE;

        @Override
        public int compare(MyComparableInt o1, MyComparableInt o2) {
            int comp = Integer.compare(o1.getA(), o2.getA());
            if(comp==0){
                comp = Integer.compare(o1.getB(), o2.getB());
                if(comp==0){
                    comp = Integer.compare(o1.getC(), o2.getC());
                    if(comp==0){
                        comp = Integer.compare(o1.getD(), o2.getD());
                    }
                }
            }
            return comp;
        }
 }

Let's create a comparing method in our class so we can analyse the code more closely. The reason for the comparing method is to allow us to easily swap implementations but leave the calling code the same.

In all cases this is how the comparing method will be called:

List<MyComparableInt> mySortedList = myComparableList.stream()
                    .sorted(comparing(
                            MyComparableInt::getA,
                            MyComparableInt::getB,
                            MyComparableInt::getC,
                            MyComparableInt::getD))
                    .collect(Collectors.toList());

The first implementation of comparing is pretty much a copy of the one in the JDK.

public static <T, U extends Comparable<? super U>> Comparator<T> 
   comparing(
            Function<? super T, ? extends U> ke1,
            Function<? super T, ? extends U> ke2,
            Function<? super T, ? extends U> ke3,
            Function<? super T, ? extends U> ke4)
    {
        return  Comparator.comparing(ke1).thenComparing(ke2)
                  .thenComparing(ke3).thenComparing(ke4);
    }

Not surprisingly this took ~6s to run through the test - but at least we have reproduced the problem and have a basis for moving forward.

Let's look at the flight recording for this test:

As can be seen there are two big issues:
1. A performance issue in the lambda$comparing method
2. Repeatedly calling Integer.valueOf (auto-boxing)

Let's try to deal with the first one, which is in the comparing method. At first sight this seems strange because when you look at the code there's not much happening in that method. One thing that is going on here extensively, however, is virtual table lookups as the code finds the correct implementation of the function. A virtual table lookup is needed when a single call site can dispatch to more than one implementation of a method. We can eliminate this source of latency with the following implementation of comparing. By expanding all of the uses of the Function interface, each line can only call one implementation and thus the call can be inlined.

public static <T, U extends Comparable<? super U>> Comparator<T> 
       comparing(
            Function<? super T, ? extends U> ke1,
            Function<? super T, ? extends U> ke2,
            Function<? super T, ? extends U> ke3,
            Function<? super T, ? extends U> ke4)
    {
        return  (c1, c2) -> {
            int comp = compare(ke1.apply(c1), ke1.apply(c2));
            if (comp == 0) {
                comp = compare(ke2.apply(c1), ke2.apply(c2));
                if (comp == 0) {
                    comp = compare(ke3.apply(c1), ke3.apply(c2));
                    if (comp == 0) {
                        comp = compare(ke4.apply(c1), ke4.apply(c2));
                    }
                }
            }
            return comp;
        };
    }

By unwinding the method the JIT should be able to inline the method lookup.
Indeed the time almost halves to 3.5s. Let's look at the Flight Recording for this run:

When I first saw this I was very surprised, because as yet we haven't made any changes to reduce the calls to Integer.valueOf but that percentage has gone right down! What has actually happened here is that, because of the changes we made to allow inlining, Integer.valueOf has been inlined and the time it takes is being blamed on the caller (lambda$comparing) which has inlined the callee (Integer.valueOf). This is a common problem in profilers, as they can be mistaken as to which method to blame, especially when inlining has taken place.

But we know that in the previous Flight Recording Integer.valueOf was highlighted so let's remove that with this implementation of comparing and see if we can reduce the time further.

return  (c1, c2) -> {
    int comp = compare(ke1.applyAsInt(c1), ke1.applyAsInt(c2));
    if (comp == 0) {
        comp = compare(ke2.applyAsInt(c1), ke2.applyAsInt(c2));
        if (comp == 0) {
           comp = compare(ke3.applyAsInt(c1), ke3.applyAsInt(c2));
           if (comp == 0) {
             comp = compare(ke4.applyAsInt(c1), ke4.applyAsInt(c2));
           }
         }
    }
    return comp;
};

With this implementation the time goes right down to 1.6s which is what we could achieve with the custom Comparator.
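The fragment above shows only the lambda body. For applyAsInt to be available, the key extractors presumably have to be declared as ToIntFunction rather than Function. The full method might look something like the sketch below, where the class name IntComparators is an assumption and Integer.compare stands in for the compare helper used in the fragment.

```java
import java.util.Comparator;
import java.util.function.ToIntFunction;

public class IntComparators {

    // Compares on four int keys in order, without boxing the keys.
    public static <T> Comparator<T> comparing(
            ToIntFunction<? super T> ke1,
            ToIntFunction<? super T> ke2,
            ToIntFunction<? super T> ke3,
            ToIntFunction<? super T> ke4) {
        return (c1, c2) -> {
            // applyAsInt returns a primitive int, so Integer.valueOf is never called
            int comp = Integer.compare(ke1.applyAsInt(c1), ke1.applyAsInt(c2));
            if (comp == 0) {
                comp = Integer.compare(ke2.applyAsInt(c1), ke2.applyAsInt(c2));
                if (comp == 0) {
                    comp = Integer.compare(ke3.applyAsInt(c1), ke3.applyAsInt(c2));
                    if (comp == 0) {
                        comp = Integer.compare(ke4.applyAsInt(c1), ke4.applyAsInt(c2));
                    }
                }
            }
            return comp;
        };
    }
}
```

The calling code stays the same, since method references such as MyComparableInt::getA that return int can also be used as a ToIntFunction.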

Let's again look at the flight recording for this run:

All the time is now going in the actual sort methods and not in overhead.

In conclusion we have learnt a couple of interesting things from this investigation:

Using the new Java8 declarative sort will in some cases be up to 4x slower than writing a custom Comparator, because of the cost of auto-boxing and virtual table lookups.
FlightRecorder, whilst being better than other profilers (see my first blog post on this issue), will still attribute time to the wrong methods, especially when inlining is taking place.

[Update] In the comments I was asked by Nitsan to include the source code and to execute the tests as a JMH benchmark. I think that these are excellent improvements to the post. Here's the source code. The results of the JMH tests are below:

Benchmark                                         Mode  Cnt     Score     Error  Units
CompTestBenchmark.bmCustomComparator             thrpt   20  2598.617 ±  67.634  ops/s
CompTestBenchmark.bmJDKComparator                thrpt   20   751.110 ±  14.835  ops/s
CompTestBenchmark.bmNoVTLComparator              thrpt   20  1348.750 ±  30.382  ops/s
CompTestBenchmark.bmNoVTLOrAutoBoxingComparator  thrpt   20  2202.995 ±  43.673  ops/s

The JMH tests were carried out on a list of 1000 items and confirm the results I saw when I ran the test on 10m items.

Aleksey Shipilev made a few comments (see comments section) and amended the JMH benchmark. It's definitely worth checking out his code changes.

Tuesday, 20 January 2015

Java8 Lambdas Tip - Collect SortedGroupingBy

Java8 introduces a great new feature which easily allows your code to decompose a List of objects into a Map of Lists of objects keyed on a particular attribute.

This is best shown by example.
Let's say you have a list of Books as below:

public class Book{
    String author;
    Date published;
    int copiesSold;
    String category;

    public String getAuthor() {
        return author;
    }

    public Date getPublished() {
        return published;
    }

    public int getCopiesSold() {
        return copiesSold;
    }

    public String getCategory() {
        return category;
    }
}

To group them into a map of authors to books used to be a little bit painful but is now a one liner!

Map<String, List<Book>> authorsToBooks = books
       .stream()
       .collect(Collectors.groupingBy(Book::getAuthor));

The only problem you might have with this is that the default Map implementation returned is a HashMap, which of course is unordered, and you might well want to order by the key, in this example by the author. Of course, you could always sort the Map in a second step, but there's a way to do it in one line.

To fix that let's introduce this static utility function:

public static <T, K extends Comparable<K>> Collector<T, ?, TreeMap<K, List<T>>> 
   sortedGroupingBy(Function<T, K> function) {
        return Collectors.groupingBy(function, 
           TreeMap::new, Collectors.toList());
}

We can call it like this:

Map<String, List<Book>> authorsToBooks = books
       .stream()
       .collect(sortedGroupingBy(Book::getAuthor));

This time the map implementation is a TreeMap which has returned the Map of authors to their books in alphabetical order.
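To see the ordering without needing the Book class, here is a small self-contained sketch that reuses sortedGroupingBy as defined above and groups plain strings by their first letter. The class name SortedGroupingDemo and the helper byInitial are made up for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class SortedGroupingDemo {

    // Same utility as in the post: groupingBy backed by a TreeMap.
    public static <T, K extends Comparable<K>> Collector<T, ?, TreeMap<K, List<T>>>
            sortedGroupingBy(Function<T, K> function) {
        return Collectors.groupingBy(function, TreeMap::new, Collectors.toList());
    }

    // Hypothetical helper: groups names by their first character, keys in sorted order.
    static TreeMap<Character, List<String>> byInitial(List<String> names) {
        return names.stream()
                .collect(SortedGroupingDemo.<String, Character>sortedGroupingBy(s -> s.charAt(0)));
    }

    public static void main(String[] args) {
        // keys come back in alphabetical order regardless of input order
        System.out.println(byInitial(Arrays.asList("carol", "alice", "bob", "anna")));
    }
}
```

Within each group the elements keep the order in which the stream encountered them, because the downstream collector is toList().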
