The Strange Lure of J
The terse world of the APL variant known as J.
It is a good idea for a developer to experience a number of programming languages. With the learning of each new language comes a new competence. For me, after many years of programming in only imperative programming languages, some exposure to OCaml and F# taught me lessons that I wish I had known at the start of my career.
Last December I read a document on the J Programming Language. I don’t even remember where I found it, but it led me to the books section of the J Software website and, before I knew it, I was installing the software.
J is a flavour of APL, a language that was developed at IBM in the 1960s. One notable feature of APL was that it used symbols to represent the core functions and operators in the language, and this meant that a new vocabulary needed to be learnt even to read APL code.
J was developed in the early 1990s by Ken Iverson and Roger Hui. Ken Iverson was the original inspiration behind APL. J is an array programming language, based on APL, but has the advantage to a programmer of using the ASCII character set.
In preserving its similarity to APL, the symbols of APL are represented in J as one- or two-character words, most commonly involving punctuation characters, occasionally paired with lower or upper case letters. This has the effect of making J a very terse language. Combine that with the way in which many of the components of the language can be composed with one another to form more complex operations and the language can be a challenge to read, even though the ASCII character set is used.
For example, the modifier / can be combined with another verb, say +, to produce a new verb that applies + between all of its operands. So
+/ 1 2 3 4 5
15
is equivalent to
1 + 2 + 3 + 4 + 5
15
We could combine it with the verb >. (which returns the larger of its two arguments) so that the new verb >./ is the J equivalent of the function max(x), where x is a list of numbers.
3 >. 4
4
4 >. 3
4
So that,
>./ 1 2 3 4 5
5
is equivalent to
1 >. 2 >. 3 >. 4 >. 5
5
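For readers more at home in a mainstream language, the behaviour of / can be modelled in a few lines of Python. This is only a sketch under my own assumptions: the name insert is invented for illustration, it handles non-empty flat lists only, and it ignores J's rule that an empty list yields the verb's identity value.

```python
import operator
from functools import reduce


def insert(verb, items):
    """Rough model of J's  verb/ items  for a non-empty flat list.

    The verb is placed between the items and the resulting expression
    is evaluated from right to left, as J does.
    """
    items = list(items)
    # Start with the rightmost item and fold towards the left.
    return reduce(lambda acc, left: verb(left, acc),
                  reversed(items[:-1]), items[-1])


print(insert(operator.add, [1, 2, 3, 4, 5]))  # models +/ 1 2 3 4 5
print(insert(max, [1, 2, 3, 4, 5]))           # models >./ 1 2 3 4 5
print(insert(operator.sub, [1, 2, 3]))        # models -/ 1 2 3, i.e. 1 - (2 - 3)
```

The subtraction case is included because J works from right to left, so the direction of the fold matters for verbs that are not associative.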
I found myself working my way through the introductory Learning J by Roger Stokes and the more challenging J for C Programmers by Henry Rich, which is also available as a downloadable PDF. All the time I needed to make reference to the J NuVoc vocabulary cheat sheet, which contains links to full descriptions of each of the monadic and dyadic forms of the various J verbs, adverbs and conjunctions.
The strange thing was that, for the best part of two months, I found it very hard to put these books down. The going was difficult, but there was something very addictive about the challenge of completing a chapter and understanding the lessons and examples in it.
Two months later, I didn’t feel as though I could call myself a J programmer but I did notice that I was able to, at least, visually parse some of the more complex lines of J, and I knew where to look to find the definition of a particular verb.
Grammar
It makes sense to define a little (simplified) J grammar at this point. What we might commonly call a statement in other languages is referred to in J as a sentence. The executable bits of a sentence are called fragments. A noun holds data. A verb operates on one or two nouns to produce another noun. An adverb operates on a verb to produce another verb.
Evaluation of a sentence is the right-to-left evaluation of the fragments of the sentence, with the result of evaluating a fragment replacing the fragment itself and being passed as an operand to the next fragment.
We saw, above, an example of an adverb, /, which modified the + verb to produce a new verb +/ which would sum the argument on its right.
A Simple Program
I decided that it was time to write a simple J program, or, rather, to attempt a challenge of sorts. I had been inspired by Rob Pike’s Ivy - an APL-like calculator, and in particular, by Russ Cox’s YouTube Videos of his solutions to the first fourteen days of the Advent of Code 2021 challenges.
My program was going to be somewhat simpler though. I wanted to see how J would compare with a .NET language like F# for simple list manipulation. So I decided on a simple list of tasks, starting with a function to display the contents of a list.
Program Parts
- Write a function, outArray, that, given a string title and an array of integers as its two parameters, outputs that array as a comma-separated list, preceded by the title, a count of the number of integers in square brackets and a colon. So, called with the parameters "Initial" and 1 2 3 4, it would return: Initial [4]: 1, 2, 3, 4
- Assign a small array of ints, say the array 31 5 7 6 3 2 8, to the variable ints.
- Use the outArray function to output the ints array with the title “Initial”.
- Create a new list, sortedInts, from the original ints which contains the same list in ascending order, and output it with the outArray function and the title “Sorted”.
- Create a list, oddInts, that contains just those ints from sortedInts that are odd and output it with the title “Odd”.
- Create a list, argvInts, that has been read from the command line and output it with the title “argvInts”.
- Create a list, fileInts, that has been read from a file on the filesystem that contains a single line of comma-separated integers and output it with the title “FileInts”.
- Combine the oddInts, argvInts, and fileInts arrays into a single list, combinedInts, without duplicates and output it with the title “CombinedInts”.
- Calculate the average of the numbers in the combinedInts array and store it in the scalar average.
- Split the numbers in the combinedInts array into two distinct arrays, higher and lower, such that higher contains those ints that are higher than the value in average and lower contains the remaining ints. Output both higher and lower with outArray.
The OutArray Function
My initial attempt at a function that could output a list of integers in CSV format using J is almost too embarrassing to include in this notebook. Let’s just say that it wasn’t quite as simple as this one:
outArray =: 4 : 0
nums =: (' ' ; ', ') stringreplace (": y)
; x;'[';(":#y);']: '; nums
)
The discovery of the library function stringreplace made life a lot easier. I have broken the definition up into two lines to make it easier to read but idiomatic J would probably show it on a single line. Experienced J programmers will probably have to stop reading at this point. I’m sure that a much more elegant implementation would be obvious in hindsight.
Strictly speaking, outArray is not a function but a dyadic verb: what we might call, in most languages, an operator that takes two arguments. It is called with a string title as its left argument and a list of integers as its right argument.
To understand a J sentence you have to begin from the right-hand-side, as the language has right-to-left precedence for verbs. So the following expression:
2 * 3 + 1
8
evaluates to 8 rather than the 7 we might expect in a language that had a hierarchy of operator precedence in which multiplication was applied before addition.
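The rule is simple enough to mimic in Python. The sketch below is my own illustration rather than anything from the J implementation: eval_right_to_left and the VERBS table are invented names, and it only understands a flat list of numbers separated by dyadic verbs.

```python
import operator

# Hypothetical table of the few dyadic verbs this toy understands.
VERBS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_right_to_left(tokens):
    """Evaluate e.g. [2, '*', 3, '+', 1] with no precedence, rightmost verb first."""
    result = tokens[-1]
    # Walk leftwards over the verbs, which sit at the odd positions.
    for i in range(len(tokens) - 2, 0, -2):
        verb = VERBS[tokens[i]]
        left = tokens[i - 1]
        result = verb(left, result)
    return result


print(eval_right_to_left([2, "*", 3, "+", 1]))  # 2 * (3 + 1)
print(eval_right_to_left([1, "-", 2, "-", 3]))  # 1 - (2 - 3)
```

Each verb takes everything to its right as its right argument, which is why the addition happens before the multiplication.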
Another factor that can slightly complicate the reading of a J statement is that most verbs come in two flavours:
- a monadic form which takes just one argument on the right hand side, and
- a dyadic form which takes two arguments, one on each side.
If a verb has a noun on its left, it is executed as a dyadic verb with a left and right operand. If a verb does not have a noun on its left, it is executed as monadic with just a right operand.
Monadic and dyadic verbs have distinct definitions and do not even have to perform the same kind of operation. In general, the dyadic form of a verb performs a variation on the operation that the monadic form performs. For example, when used monadically the verb <. calculates the floor of its operand (the greatest integer less than or equal to it) and when used dyadically it returns the smaller of its two operands.
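One way to picture a verb with two meanings is a Python function that behaves differently depending on how many arguments it receives. This is just an analogy of mine, with the invented name floor_or_lesser, and it deals with single numbers only whereas the real <. works on whole arrays.

```python
import math


def floor_or_lesser(*args):
    """Toy model of J's <. verb: one argument is monadic, two is dyadic."""
    if len(args) == 1:
        (y,) = args
        return math.floor(y)      # monadic: Floor
    if len(args) == 2:
        x, y = args
        return min(x, y)          # dyadic: Lesser Of
    raise TypeError("a verb takes one or two arguments")


print(floor_or_lesser(3.7))    # monadic use, like  <. 3.7
print(floor_or_lesser(3, 4))   # dyadic use, like  3 <. 4
```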
Even the definition of verbs is syntactically unusual. In the definition above, the body that gets assigned to outArray is the part between the fragment 4 : 0 and the line that begins with a ) three lines below. The 4 means that we are defining a dyadic verb and the 0 means that its definition is to be found in the following lines of the current document, up as far as the line that begins with a right-hand round bracket. It makes for difficult reading at first, but you get used to it. It is well defined, and there is a keyword form that means the same thing, which you can use if you prefer.
In a dyadic verb definition the left argument is represented by x and the right argument by y.
The first line looks like this:
nums =: (' ' ; ', ') stringreplace (": y)
We start evaluation with the right-most fragment (": y) which is the verb ": applied to the value of y (the right-hand argument of outArray).
The J software website provides a handy cheat sheet, known as NuVoc, which contains links to the definitions of all of the major verbs that are built into J.
From NuVoc we can see that our verb ": is known as “Default Format” and that it will simply convert our list of integers into a string which contains the integers separated by a space.
So the fragment will be replaced with the string representation of our list and this will become the right-hand argument to stringreplace.
So if we invoked the outArray verb with a right-hand argument of
1 2 3
the first line would, after evaluating its right-most fragment, become:
nums =: (' ' ; ', ') stringreplace '1 2 3'
The verb stringreplace is a J standard library verb. Its left argument is a two-column matrix of “old” and “new” replacements. Its right argument is the string in which to make the replacements. The resulting string is returned as the value of evaluating the fragment.
After our string replacements the variable nums will contain the list of numbers as a comma-separated string:
1, 2, 3
The second line is less easy on the eye:
; x;'[';(":#y);']: '; nums
The key to understanding it is to know that the dyadic verb ; (known as Link) is being used to join the following nouns:
- x
- '['
- (":#y)
- ']: '
- nums
The first, x, is the left-hand argument to outArray, which is the string title we wish to display at the start of the output line.
The second item is a string containing a left-hand square bracket.
The third item has 3 parts:
- The monadic verb ": (known as Default Format) which converts an integer to its string representation.
- The monadic verb # (known as Tally) which counts the items in its argument.
- y which is the right-hand argument to outArray and is our original list of integers.
We evaluate these verbs from right to left which means that we apply Tally to count the items in our integer list first. If, for example, the list was 1 2 3 4 then the count returned by Tally would be 4. We then use Default Format to convert that count to a string because, in J, type conversions do not happen automatically and we want to link string values together with our Link verb.
The fourth item is a string containing a right-hand square bracket, a colon and a space.
The fifth item is nums, our string representation of the integer list.
Our verb, Link, joins all of these together into what is known as a list of boxed items.
Then finally, represented by the first semi-colon character on the line, the monadic verb ; (known as Raze) removes the level of boxing that was previously applied by Link and runs the results together without spaces into a single string.
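To check my own understanding of the two lines, I find it helps to restate them in Python. The function below is a loose translation, not a port: out_array is a name I have made up, a plain list of strings stands in for J's boxed items, and J's formatting of negative numbers (with a leading underscore) is not modelled.

```python
def out_array(title, ints):
    """Loose Python restatement of the J verb outArray described above."""
    # Default Format: the integers as one space-separated string.
    formatted = " ".join(str(n) for n in ints)
    # stringreplace: every space becomes a comma followed by a space.
    nums = formatted.replace(" ", ", ")
    # Link: collect the five pieces (a list stands in for boxes).
    pieces = [title, "[", str(len(ints)), "]: ", nums]
    # Raze: run the pieces together into a single string.
    return "".join(pieces)


print(out_array("Initial", [31, 5, 7, 6, 3, 2, 8]))
```

Note that nothing in either version puts a space between the title and the opening square bracket.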
Assignment and Output
There are two kinds of assignment in J: local assignment =. and global assignment =: . Local assignment can be used to create a noun, adverb or verb that hides any global with the same name for the duration of the containing verb definition or script. On termination of execution of the verb or script the previous value is reinstated.
ints =: 31 5 7 6 3 2 8
'Initial' outArray ints
Initial[7]: 31, 5, 7, 6, 3, 2, 8
We use global assignment to create the noun ints to hold a list of integers. And we output it with a call to our dyadic verb outArray, passing its two arguments on the left- and right-hand sides: 'Initial' and ints.
Sort the List
Sorting in J is performed by the built-in verbs /: and \: . These are known as Grade Up and Grade Down when used monadically and Sort Up and Sort Down when used dyadically. To understand Sort Up you need to first understand Grade Up.
When applied to a list of atoms (integers in our case), Grade Up returns the permutation of those atoms that will result in them being sorted into ascending order. The permutation is the list of indexes into the original list that, were items to be selected according to those indexes, would result in an ascending list of the atoms. (Indexes in J are 0-based.)
For example, the Grade Up permutation of the list 5 3 4 is the list 1 2 0:
/: 5 3 4
1 2 0
The permutation tells us to first take the atom at index 1 in the original list (3), then the atom at index 2 (4), then the atom at index 0 (5), giving us the expected list 3 4 5.
Used dyadically, the /: verb is known as Sort Up. When given arguments x and y the result of x /: y is to sort x using the order given by y. In other words, it applies to list x the permutation that sorts list y into ascending order.
And so, if x and y are the same thing the return value is that of x sorted into ascending order.
sortedInts =: ints /: ints
'Sorted' outArray sortedInts
Sorted[7]: 2, 3, 5, 6, 7, 8, 31
We obtain a sorted list by invoking Sort Up with our ints list passed as both left and right arguments. We store the result in sortedInts for later use.
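The relationship between Grade Up and Sort Up is easy to state in Python, which may help if the J description feels abstract. This is a sketch for flat lists only, and grade_up and sort_up are my own names for the two roles of /: .

```python
def grade_up(y):
    """Indexes that would put y into ascending order (models monadic /: )."""
    # sorted() is stable, so equal items keep their original relative order.
    return sorted(range(len(y)), key=lambda i: y[i])


def sort_up(x, y):
    """Reorder x by the permutation that sorts y (models dyadic  x /: y )."""
    return [x[i] for i in grade_up(y)]


ints = [31, 5, 7, 6, 3, 2, 8]
print(grade_up([5, 3, 4]))   # [1, 2, 0]
print(sort_up(ints, ints))   # [2, 3, 5, 6, 7, 8, 31]
```

Keeping the two steps separate also shows why x and y need not be the same list: one list can be ordered by the values of another.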
Select the Odd Integers
We can create a list containing only the odd integers from a list by making use of two J verbs: the dyadic form of the verb # (known as Copy) and the dyadic form of the verb | (known as Residue).
When called with arguments x and y the value of x # y is a new array in which each integer in x determines how many times the corresponding item of y appears. For example:
3 1 0 2 # 6 7 8 9
6 6 6 7 9 9
This results in 3 sixes, 1 seven, 0 eights and 2 nines. So, in order to pluck the odd numbers from a list (our y argument to Copy) we could use a left argument that contained a one in each position in which an odd number appeared and a zero in each position in which an even number appeared.
The verb Residue will create such a list. The expression x | y returns the remainder when dividing an integer in y by the corresponding integer in x. If x consists of just a single integer then it will return the remainder of dividing every integer in y by that single integer. So the expression 2 | y will return an array, the same length as y, which contains a one in every position where the corresponding integer in y is odd and a zero in every position where the corresponding integer is even.
Combining these two, and adding brackets because J evaluates from right to left, we get:
oddInts =: (2 | sortedInts) # sortedInts
'Odd' outArray oddInts
Odd[4]: 3, 5, 7, 31
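The same mask-and-copy idea can be written out in Python. Again this is only an illustrative sketch: residue and copy are names I have chosen, residue takes a single integer on the left, and both work on flat lists.

```python
def residue(x, y):
    """Remainder of each item of y on division by x (models  x | y )."""
    return [n % x for n in y]


def copy(x, y):
    """Repeat each item of y by the matching count in x (models  x # y )."""
    return [item for count, item in zip(x, y) for _ in range(count)]


sorted_ints = [2, 3, 5, 6, 7, 8, 31]
print(copy([3, 1, 0, 2], [6, 7, 8, 9]))            # [6, 6, 6, 7, 9, 9]
print(residue(2, sorted_ints))                     # [0, 1, 1, 0, 1, 0, 1]
print(copy(residue(2, sorted_ints), sorted_ints))  # [3, 5, 7, 31]
```

The mask of ones and zeroes is an ordinary list of counts, so selecting items is just a special case of repeating them.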
Read from the Environment
The next part of the program was to read a list of integers from the environment. I’ve not attempted to read from the environment in this notebook but have, instead, simulated the environment to show how it would be done.
In a J script, the system variable ARGV_z_ would contain a list of strings (character arrays in J) with the name of the program at the head of the list and the arguments, as character arrays, in the rest of the list.
The assignment to ourArgv below uses the dyadic form of the verb ; (Link) to join a number of strings together into a list of boxed items to simulate the effect of a real ARGV_z_ array.
ourArgv =: 'programName';'33';'44';'55';'66'
argvInts =: ". }. > ourArgv
'ArgvInts' outArray argvInts
ArgvInts[4]: 33, 44, 55, 66
The right-hand side of the assignment to argvInts contains three verbs and a noun. The noun ourArgv is simulating the system noun (ARGV_z_) which contains a boxed array of the arguments that were provided to the program.
The verb > does not have a noun to its left and so its monadic form (Open) is used. The effect of this is to un-box the array of arguments. However, the first item in this list is always the name of the program, and we want to discard that.
The verb }. does not have a noun as its left argument so the monadic form of }. (Behead) is used. The action of Behead is to drop the leading item from its argument and return the rest of the array. This has the desired effect of leaving us with an array, each entry of which is itself a character array representing an integer. We can convert these character arrays into integers using the monadic version of ". (Do). The effect of Do is to execute each of the strings, replacing them with the result of the execution which, in this case, is their integer representation.
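For comparison, here is roughly what those three verbs do, step by step, in Python. It is a sketch built on my own assumptions: argv_ints is a made-up name, the argument list is taken to be non-empty, and the padding step reflects my understanding that opening boxed strings of different lengths gives a rectangular table of characters.

```python
def argv_ints(argv):
    """Step-by-step model of  ". }. > argv  for a non-empty list of strings."""
    # Open: unbox into rows of equal width, padding short rows with spaces.
    width = max(len(arg) for arg in argv)
    opened = [arg.ljust(width) for arg in argv]
    # Behead: drop the first row, which is the program name.
    beheaded = opened[1:]
    # Do: turn each remaining row into a number (int() ignores the padding).
    return [int(row) for row in beheaded]


print(argv_ints(["programName", "33", "44", "55", "66"]))
```

In a real Python script the list passed in would be sys.argv.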
Read from a File
Working from the right-hand side of the expression that is assigned to fileInts, we first read a list of integers from a file in which they are stored (separated by commas). To do this we make use of the monadic form of the system verb freads. This reads the file whose path is given by the contents of its argument F and returns it as a string.
This string (in our file) will contain a final linefeed character and so we use the verb }: (Curtail) to drop it and return just the file contents.
The dyadic verb , (Append) is used, with the single character ',' as its left argument and our string from the file as its right argument, to join the two together, effectively prepending a single comma character to the string that has been read from the file.
So far we have a single string containing our integers, each separated with a comma and with a single comma character at the start of the string. We operate on this string using the adverb ;._1 (Cut). This is one of the family of Cut adverbs and this one will cut up a string into an array using the first character of the string to indicate the cut points, and removing all of the cut points from the resulting array. The first character of our string is a comma, and so the string gets cut into pieces, each representing one of the numbers in our comma-separated-variable file.
Finally, the monadic verb ". (Do), which is the verb that Cut modifies, is applied to each of those pieces to convert it into an integer, leaving us with an array of integers.
F =: '/Users/john/j903-user/projects/intarray/fileints.csv'
fileInts =: ".;._1 ',', }: freads F
'FileInts' outArray fileInts
FileInts[4]: 5, 8, 10, 12
We assign the resulting integer array to fileInts.
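The trick of prepending the delimiter so that Cut can find it is unusual enough to be worth spelling out in Python. The sketch below mirrors the J steps rather than doing things the natural Python way; cut_on_first_char, parse_file_text and file_ints are invented names, and the text is assumed to end with exactly one linefeed.

```python
def cut_on_first_char(s):
    """Split s at every occurrence of its first character, dropping the delimiters."""
    delimiter, rest = s[0], s[1:]
    return rest.split(delimiter)


def parse_file_text(text):
    """Mirror of the J pipeline: Curtail, Append a comma, Cut, then Do each piece."""
    curtailed = text[:-1]           # Curtail: drop the final linefeed
    prefixed = "," + curtailed      # Append: put the delimiter at the front
    return [int(piece) for piece in cut_on_first_char(prefixed)]


def file_ints(path):
    """Read the file at path and parse its single line of integers."""
    with open(path) as f:
        return parse_file_text(f.read())


print(parse_file_text("5,8,10,12\n"))
```

A Python programmer would normally just strip and split the line, but following the J steps shows that Cut is told nothing about commas: it takes its delimiter from the data.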
Combine without Duplicates
We now need to combine the three arrays oddInts, argvInts and fileInts into a single array of integers without retaining duplicates.
J makes this very easy for us. We use the dyadic , (Append) to combine the three arrays into a single array. Then we use the monadic verb ~. (Nub) to remove any item of an array that matches a preceding item, i.e. to remove duplicates.
combinedInts =: ~. oddInts, argvInts, fileInts
'CombinedInts' outArray combinedInts
CombinedInts[11]: 3, 5, 7, 31, 33, 44, 55, 66, 8, 10, 12
We assign the resulting list, now free of duplicates, to combinedInts.
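Nub keeps the first occurrence of each item and preserves the original order, which is why the 5 from the file disappears while the 8, 10 and 12 stay at the end. A Python sketch of the same behaviour, with the invented name nub and assuming hashable items, could look like this:

```python
def nub(items):
    """Remove later duplicates, keeping first occurrences in order (models ~. )."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(items))


odd_ints = [3, 5, 7, 31]
argv_ints = [33, 44, 55, 66]
file_ints = [5, 8, 10, 12]
print(nub(odd_ints + argv_ints + file_ints))
```

Converting to a set would also remove duplicates but would not promise to keep the order, so it is not an exact match for Nub.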
Calculate the Average
Next we need to calculate the average of our combined list.
When three verbs occur one after another in a sequence (f g h), and f is monadic, g is dyadic and h is monadic, this is known as a fork. This pattern occurs commonly enough for it to be recognised as a special case in J. A monadic fork with a single right-hand argument y, say, has the following interpretation:
(f g h) y means (f y) g (h y)
So, the noun y is first supplied to the monadic verb h, then to the monadic verb f, and then the results of those computations are provided to the dyadic verb g as its right and left arguments respectively.
To calculate an average from our list of integers we employ a fork made up of the following verbs:
- +/ A monadic verb that calculates the sum of the items in its argument.
- % (Divide) A dyadic verb that divides its left argument by its right argument.
- # (Tally) A monadic verb that counts the number of items in its argument.
So, in the assignment to average below we have a fork (+/ % #) in which +/ plays the role of f, Divide plays the role of g and Tally plays the role of h.
When supplied the argument combinedInts, represented by y, (f g h) y becomes:
(+/ % #) combinedInts
Which expands to
(+/ combinedInts) % (# combinedInts)
Or, in plain words, we calculate the sum of the integers in combinedInts and divide it by a count of the number of integers in combinedInts which, of course, gives us an average.
average =: (+/ % #) combinedInts
'Average' outArray average
Average[1]: 24.9091
Which looks to be of the right size.
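A fork is really just a way of combining three functions, and that can be expressed directly in Python as a higher-order function. This is a sketch of the monadic case only; fork and mean are names of my own choosing.

```python
import operator


def fork(f, g, h):
    """Build the monadic fork (f g h): apply f and h to y, then join with g."""
    def forked(y):
        return g(f(y), h(y))
    return forked


# sum plays the role of +/, true division of %, and len of #.
mean = fork(sum, operator.truediv, len)

combined_ints = [3, 5, 7, 31, 33, 44, 55, 66, 8, 10, 12]
print(round(mean(combined_ints), 4))  # 24.9091
```

The J version needs no helper like fork because three verbs written side by side are already read that way.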
Split into Higher and Lower Lists
The last part of the program is to split the combinedInts array into two distinct arrays with one containing all of the integers that are higher than the average and the other containing the remaining integers.
The dyadic verb > (Greater), given an array of integers as its left argument and a number as its right argument, produces an array of zeroes and ones of the same length as its left argument and containing a one in every position where the item in the left argument array is greater than the number given as the right argument. So, for example, 3 5 2 > 4 returns the array 0 1 0.
We can use the conjunction & (Bond) to attach a value to the right or left argument of a dyadic verb to produce a monadic verb with a bound argument. So the expression (>&average) is a monadic verb that returns an array of zeroes and ones according to whether each item in the array passed as its argument has a value greater than the value of average.
So if average has the value 24.9091 then (>&average) 23 24 25 will return 0 0 1.
As we did earlier, in our calculation of the odd numbers in a list, we can make use of the dyadic # (Copy) to select only those items in combinedInts which are greater in value than average. We assign those integers to the higher array.
We then use the -. verb (Less) which, when left and right arguments are lists, returns all of the items in the left argument that are not in the right argument. We assign that result to the lower array.
higher =: ((>&average) combinedInts) # combinedInts
lower =: combinedInts -. higher
'Higher' outArray higher
'Lower' outArray lower
Higher[5]: 31, 33, 44, 55, 66
Lower[6]: 3, 5, 7, 8, 10, 12
We now have two lists: one containing all of the integers above average and another containing the rest.
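As a final cross-check, the same split can be restated in Python, following the J steps one for one. The function name split_by_average is my own, and the sketch assumes a non-empty flat list of numbers.

```python
def split_by_average(ints):
    """Split ints into (higher, lower) around their mean, following the J steps."""
    average = sum(ints) / len(ints)
    # Greater with a bound right argument: a mask of ones and zeroes.
    mask = [1 if n > average else 0 for n in ints]
    # Copy: keep the items whose mask entry is one.
    higher = [n for keep, n in zip(mask, ints) if keep]
    # Less: everything in ints that is not in higher.
    lower = [n for n in ints if n not in higher]
    return higher, lower


combined_ints = [3, 5, 7, 31, 33, 44, 55, 66, 8, 10, 12]
higher, lower = split_by_average(combined_ints)
print(higher)
print(lower)
```

The Python version is not much longer, but each J line says the same thing with two or three symbols.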
In Summary
It is difficult to understand what I found to be so satisfying about learning and using J:
- I think that partly it was the challenge - a challenge to learn a language that was very different to other languages that I had used.
- I found the clarity of documentation, at www.jsoftware.com, to be of a level that I had only previously found in old Unix, Plan 9 or Inferno documentation.
- It felt like the features of the language were necessary and sufficient to allow a particular problem to be solved.
- I can begin to see how having a terse language can help an experienced reader to visualise a complex algorithm without having to page around in source code.
- There didn’t seem to be a big overlap between the roles of the system verbs. It didn’t feel like there was supposed to be more than one way to do something (I’m reminded of Perl here).
- J has a very mathematical, and precise, feel to it.
- Using J and its ecosystem has helped to reinforce my belief that newer is certainly not always better.
Sadly, it also feels as though J might not be around for ever, and that would be a shame.