- Support for variable names and distinguished argument lists.
- Type deduction in opcode generation. In particular, you don't have to distinguish between iadd, ladd, fadd, and dadd when adding two numbers. The same goes for the other arithmetic operations, loading and storing values in arrays, returning values, casting, etc. By the same token, you don't have to fully specify method signatures.
- The import statement saves you from always having to fully specify Java class names. There is no corresponding import opcode for the Java virtual machine, so this is purely a Java language feature.
- The Java language doesn't force you to distinguish between different opcodes when loading constants. The Java virtual machine provides several such opcodes to make bytecode more compact. Any higher-level tool (including the ASM bytecode library) should handle this automatically.
- Java provides control flow constructs in place of goto and conditional jumps. Part of this is the uniform condition system, which saves you from having to determine which conditional jump opcode to use.
- A uniform call syntax so you don't have to distinguish between invokestatic, invokevirtual, invokeinterface, and invokespecial.
- Built-in support for l-values and generalized place forms, so you can use a general assignment form on local variables, array indices, and static and instance fields. You can even set values in multidimensional arrays and the Java compiler will produce the correct combination of opcodes for you.
- Last but not least, the Java language automatically handles arguments on the stack for you. This can be useful, for example, when dealing with mathematical expressions.
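A few of the points above can be seen in one small Java sketch. The class, field, and method names here are made up for illustration, and the opcodes named in the comments are what javac typically emits for code like this, not a guaranteed listing.

```java
class LanguageConveniences {
    static int counter;             // static field
    int[][] grid = new int[2][2];   // instance field holding a two-dimensional array

    static double mixedSum(int i, long l, float f, double d) {
        // One '+' operator for every numeric type. The compiler widens the
        // operands as needed and picks ladd, fadd, and dadd for the three
        // additions here based on the operand types.
        return i + l + f + d;
    }

    int assignEverywhere(int value) {
        int local = value;          // local variable: istore
        counter = local + 1;        // static field: putstatic
        grid[1][0] = counter * 2;   // array element: getfield, aaload, then iastore
        return grid[1][0];          // ireturn, chosen from the declared return type
    }

    public static void main(String[] args) {
        System.out.println(mixedSum(1, 2L, 3.5f, 4.25));                  // prints 10.75
        System.out.println(new LanguageConveniences().assignEverywhere(5)); // prints 12
    }
}
```

The same `=` and `+` cover every case in the source, while the bytecode needs a different instruction for nearly every line.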
Setting up a mixed project:
To set up a mixed Java and Clojure project, I suggest using IntelliJ IDEA. IntelliJ is the only Java-focused IDE that also has good support for Clojure. Then you just need to create separate Java and Clojure folders and configure Leiningen by listing your Clojure folder in :source-paths and your Java folder in :java-source-paths.
Creating a Java class:
As a first example of Java and Clojure interop, I have chosen a prime number sieve. Obviously, you could easily write this in Clojure, but perhaps some mathematical functionality should be written in Java so that it is more performant.
import java.util.BitSet;

public class NumberUtilities {
    public static int[] sieve(int n) {
        BitSet primes = new BitSet(n+1);
        primes.flip(2, n+1);
        for(int p = 2; p*p <= n; p++) {
            if(primes.get(p)) {
                for(int i = p*p; i <= n; i += p) {
                    primes.set(i, false);
                }
            }
        }
        return primes.stream().toArray();
    }
}
I mentioned that the Java language is just a tool for generating Java virtual machine bytecode. It's kind of like an M-expression syntax for the JVM, and Lisp Flavoured Java is an S-expression syntax. Clojure is a different beast entirely from either of them. For the purposes of this demonstration, let's examine the output bytecode as it appears with Jasmin.
.class public NumberUtilities
.method public static sieve(I)[I
    .limit stack 4
    .limit locals 4
    ; initialize the bit set
    new java/util/BitSet
    dup
    iload_0
    iconst_1
    iadd
    invokespecial java/util/BitSet.<init>(I)V
    ; flip the possible primes to true
    astore_1
    aload_1
    iconst_2
    iload_0
    iconst_1
    iadd
    invokevirtual java/util/BitSet.flip(II)V
    ; initialize the first prime to two
    iconst_2
    istore_2
    ; start a loop in order to do the sieve on the main bit set
loop:
    iload_2
    iload_2
    imul
    iload_0
    if_icmpgt loop_breakpoint
    ; ensure that this number is a prime before starting the inner loop
    aload_1
    iload_2
    invokevirtual java/util/BitSet.get(I)Z
    ifeq inner_loop_breakpoint
    ; initialize the current multiple to the first non flipped index
    iload_2
    iload_2
    imul
    istore_3
    ; flip all multiples of the current prime to false
inner_loop:
    iload_3
    iload_0
    if_icmpgt inner_loop_breakpoint
    ; set the current index to false
    aload_1
    iload_3
    iconst_0
    invokevirtual java/util/BitSet.set(IZ)V
    iload_3
    iload_2
    iadd
    istore_3
    goto inner_loop
inner_loop_breakpoint:
    iinc 2,1
    goto loop
loop_breakpoint:
    ; convert the bitset into an int array containing all true indices and return
    aload_1
    invokevirtual java/util/BitSet.stream()Ljava/util/stream/IntStream;
    invokeinterface java/util/stream/IntStream.toArray()[I
    areturn
.end method
The advantages of the Java language can clearly be seen by comparing the Java language code to the Java virtual machine bytecode. Whenever someone says that Java is verbose, I just remember how much typing it saves compared to writing JVM bytecode by hand, which I have done a lot. Too much.
A notable aspect of this is how the compiler structures the output of for loops. There is quite a lot to unpack when using a for loop, and its logic appears all over the place in the compiled output. That is why some parts of the for loop appear before the loop starts, at the start of the loop, and at the end. Once you unpack all of that it is fairly easy to see how Java code corresponds to bytecode. In that sense, Java is one of the easier languages to understand in terms of its compiler output.
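To make the loop layout easier to see without reading bytecode, here is a rough sketch that rewrites the sieve with while loops, so that each part of the for header sits where it ends up in the compiled output. The class name and the two method names are invented for this comparison.

```java
import java.util.Arrays;
import java.util.BitSet;

class LoopLayout {
    // The sieve as written in the article, using for loops.
    static int[] sieveFor(int n) {
        BitSet primes = new BitSet(n + 1);
        primes.flip(2, n + 1);
        for (int p = 2; p * p <= n; p++) {
            if (primes.get(p)) {
                for (int i = p * p; i <= n; i += p) {
                    primes.set(i, false);
                }
            }
        }
        return primes.stream().toArray();
    }

    // The same sieve with each for header split into its three parts.
    static int[] sieveWhile(int n) {
        BitSet primes = new BitSet(n + 1);
        primes.flip(2, n + 1);
        int p = 2;                      // initializer: runs once, before the loop label
        while (p * p <= n) {            // condition: tested at the top of every pass
            if (primes.get(p)) {
                int i = p * p;          // inner initializer
                while (i <= n) {        // inner condition
                    primes.set(i, false);
                    i += p;             // inner update: last step before jumping back
                }
            }
            p++;                        // update: last step before jumping back
        }
        return primes.stream().toArray();
    }

    public static void main(String[] args) {
        // Both forms should produce identical results.
        System.out.println(Arrays.equals(sieveFor(100), sieveWhile(100))); // prints true
    }
}
```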
Calling Java functions from Clojure:
All the countless hours spent reading the documentation of the Java virtual machine, the Java language, and the thousands of classes in the Java standard library are finally rewarded by using Clojure, which has seamless Java interop.
(prn (seq (NumberUtilities/sieve 1000)))
Running the sieve function written in Java produces all the prime numbers up to a thousand, which confirms our memory of the smallest primes.
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67
71 73 79 83 89 97 101 103 107 109 113 127 131 137 139
149 151 157 163 167 173 179 181 191 193 197 199 211 223
227 229 233 239 241 251 257 263 269 271 277 281 283 293
307 311 313 317 331 337 347 349 353 359 367 373 379 383
389 397 401 409 419 421 431 433 439 443 449 457 461 463
467 479 487 491 499 503 509 521 523 541 547 557 563 569
571 577 587 593 599 601 607 613 617 619 631 641 643 647
653 659 661 673 677 683 691 701 709 719 727 733 739 743
751 757 761 769 773 787 797 809 811 821 823 827 829 839
853 857 859 863 877 881 883 887 907 911 919 929 937 941
947 953 967 971 977 983 991 997)
Clojure is able to call the Java sieve function because they both speak the same basic language: Java virtual machine bytecode. We saw the JVM output of the Java class. The corresponding Clojure class produces the same sort of opcodes, starting with an invokestatic that calls the sieve function. The only difference is that Clojure might have to use reflection if it is not given enough type information.
In this case that isn't necessary because NumberUtilities doesn't use method overloading, but in general the most important performance improvement you can make to your Clojure code is to add type hints, so that reflection isn't needed to determine method signatures at runtime. Finally, after calling sieve we use seq in order to print the result, because sequences produce better output when converted to Strings. A more Java-like solution would be to use java.util.Arrays/toString.
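The difference between a call resolved at compile time and one resolved by reflection can be sketched from the Java side. Arrays.toString is a handy subject because it is overloaded for every primitive array type. The class and method names below are invented, and this is only an illustration of the kind of lookup involved, not a description of how Clojure's own reflector is implemented.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class OverloadLookup {
    // Direct call: javac resolves the int[] overload at compile time
    // and emits a single invokestatic.
    static String direct(int[] values) {
        return Arrays.toString(values);
    }

    // Reflective call: the overload is found by name and parameter type
    // at run time, which is the extra work that type information avoids.
    static String reflective(int[] values) {
        try {
            Method toString = Arrays.class.getMethod("toString", int[].class);
            // Static method, so the receiver is null. The cast to Object
            // passes the array as one argument.
            return (String) toString.invoke(null, (Object) values);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] primes = {2, 3, 5, 7, 11};
        System.out.println(direct(primes));     // prints [2, 3, 5, 7, 11]
        System.out.println(reflective(primes)); // prints [2, 3, 5, 7, 11]
    }
}
```

Both calls return the same string; the reflective one simply does the method lookup on every call unless the Method object is cached.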
The Java language and Clojure complement each other perfectly, because Clojure isn't just another copy of Java. Java is static, imperative, heteroiconic, etc., while Clojure is dynamic, functional, and homoiconic. The fact that two languages so different from one another can come together is the ultimate testament to the power of the Java virtual machine.