An Implementation of J -- Verbs

Anatomy of a Verb
Rank
Atomic (Scalar) Verbs
Obverses, Identities, and Variants
Error Handling

Anatomy of a Verb

Verbs are implemented as functions. A verb applies to a noun (if used monadically) or to two nouns (if used dyadically), and produces a noun result. The defined type AF and the macros F1 and F2 codify these properties:

   typedef A(*AF)();

   #define F1(f)  A f(J jt,    A w); 
   #define F2(f)  A f(J jt,A a,A w);

AF is the data type of a function having these properties. F1 and F2 are used to specify the headers of functions implementing verbs (and adverbs and conjunctions); the majority of functions in the implementation are so specified. (jt is the global variables parameter; a and w denote a and w, traditionally the names given to the left and right arguments of APL functions.) Verbs are represented by arrays of type VERB; the details of this representation are discussed in Adverbs and Conjunctions.

The verb j. is used here to illustrate the relationship among relevant system components. j. has monad 0j1&*"0 and dyad (+j.)"0. There are three main steps in the implementation:

1. Define and declare functions which implement the monad and the dyad.
2. Associate j. with the functions and other information.
3. Specify obverses, identity functions, and variants (if any).

These steps are executed as follows:

1. Functions which implement the monad and the dyad j. are added to file vm.c (or to one of the v*.c files), and declarations are added to je.h:

	File vm.c		File je.h
	`F1(jtjdot1){R tymes(a0j1,w);}`		`extern F1(jtjdot1);`
	`F2(jtjdot2){R plus(a,tymes(a0j1,w));}`		`extern F2(jtjdot2);`

2. The association between j. and jdot1 and jdot2 is established in the table pst, initialized by functions pinit and pdef in file t.c. pst is declared as A pst[256]; , a 256-element vector of type A, and pst[x] contains the information for the primitive whose ID is unsigned byte x. The ID for j. is CJDOT, therefore the information for j. can be found in pst[(UC)CJDOT]. The surrounding entries in pst are initialized as follows; the entry for j. indicates that it is a verb, with monad jdot1, dyad jdot2, and zero monadic, left, and right ranks:

   /*  i: */  pdef(CICO,    VERB, jtjico1,   jtjico2,  0L,  RMAX,RMAX);
   /*  I. */  pdef(CICAP,   ADV,  jticap,    0L,       0L,  0L,  0L  );
   /*  j. */  pdef(CJDOT,   VERB, jtjdot1,   jtjdot2,  0L,  0L,  0L  );
   /*  L. */  pdef(CLDOT,   VERB, jtlevel1,  0L,       RMAX,0L,  0L  );
   /*  L: */  pdef(CLCO,    CONJ, 0L,        jtlco,    0L,  0L,  0L  );

The macro ds(x) is defined as pst[(UC)(x)], and is a convenient reference for a primitive; for example, ds(CJDOT) is the verb j. as an array (in short, ds(CJDOT) is j.).

3. A verb may have information additional to that in pst, embodied in functions inv and invamp (obverses) in file ai.c, iden (identity functions) in ai.c, and fit (variants) in cf.c. See Obverse, Identities, and Variants.

The obverses associated with j. are:

   j.       %&0j1          
   n&j.     %&0j1@(-&n) 
   j.&n     -&(j.n)

The obverse of j. is implemented as case CJDOT in inv; those for n&j. and j.&n are implemented as case CJDOT in invamp. The identity function for j. is $&0@(}.@$), and is implemented as case CJDOT in iden. j. has no variants; the implementation of a variant would have required a case in fit.

Rank

Verb (function) rank was introduced by Iverson [1978 §6], further developed in Iverson [1983, 1987, 1995], and implemented in SHARP APL, SHARP APL/HP, SAX, A, and J (see Bernecky et al. [1983, 1987], Hodgkinson [1986], Steinbrook [1986], Whitney [1989], and Hui et al. [1990], respectively). This description first appeared in Hui [1995].

A verb of rank r is defined on arguments with rank bounded by r; the extension to higher-rank arguments is the same for all verbs. The rank conjunction " (operator) augments the default ranks of a verb by user-specified ranks. It provides for the generalization of a verb to higher-rank arrays, and could justifiably be called the generalization or extension operator; it also provides for consistent application to lower-rank arrays, subsuming and superseding the anomalous bracket-axis operator.

Various aspects of rank are here discussed in terms of a model in J, updated from Hui [1987 §A.2].

Frames and Cells. A rank r splits the argument shape into the frame and the cell shape; a positive r specifies the number of trailing cell axes, while a negative r specifies the negative of the number of leading frame axes.

rk    =: #@$
er    =: (0:>.(+rk))`(<.rk) @. (0:<:[)
fr    =: -@er }. $@]
cs    =: -@er {. $@]
boxr  =: ]`(<@$ , [ $: */@[}.])@.(*@#@])
cells =: fr $ cs boxr ,@]

For rank r and argument y, the phrase r er y computes the effective rank (non-negative and bounded by #$y); r fr y computes the frame and r cs y the cell shape; and r cells y computes the array of cells with shape r fr y, each cell individually boxed and shaped s=: r cs y (r cells y is <"r y). The recursively-defined verb s boxr y produces the list of such cells.

The model is shown in action on x*"0 _1 y, the atoms (scalars) of x times the items of y:

   x=:1 2 3
   y=:i.3 2
   y                        x*"0 _1 y
0 1                      0  1
2 3                      4  6
4 5                      12 15
   0 er x                   _1 er y
0                        1
   0 fr x                   _1 fr y
3                        3
   0 cs x                   _1 cs y
                         2
   0 cells x                _1 cells y
┌─┬─┬─┐                  ┌───┬───┬───┐
│1│2│3│                  │0 1│2 3│4 5│
└─┴─┴─┘                  └───┴───┴───┘

Agreement. In the dyad v"r, commonly the left and right frames match, that is, the two cell arrays have the same shape; if not, several design choices are possible: In scalar agreement, one frame must be empty, and the single cell is reshaped using the other frame; in suffix agreement, one frame must be a suffix of the other, and again the list of cells is reshaped using the other frame; finally, in prefix agreement, one frame must be a prefix of the other, and each cell is reshaped with the excess in the other frame. All three agreements are proper generalizations of scalar extension in APL\360, with cells acting the role of scalars. Agreement results in the two cell arrays having the same shape ("the frame").

Prefix agreement is adopted in J as suggested by Whitney [1992], because it best fits the emphasis on leading axes.

pfx   =: <.&rk
agree =: (pfx {. $@[) -: (pfx {. $@])
frame =: [:`($@([^:(>&rk))) @. agree
rag   =: frame $ ([: */ rk@]}.$@[) # ,@]
lag   =: rag~

rag and lag apply to both cell arrays (the results of cells in the previous section), producing cell arrays with the same shape. If v"r itself were used in the model, rag could be defined more directly from the specification: (rk@]}.$@[) $"1 0 ] — each cell is reshaped with the excess in the other frame. In the continuing example, rag and lag have no effect because the left and right frames match.

   [xc=.0 cells x           [yc=._1 cells y
┌─┬─┬─┐                  ┌───┬───┬───┐
│1│2│3│                  │0 1│2 3│4 5│
└─┴─┴─┘                  └───┴───┴───┘
   [xa=.xc lag yc           [ya=.xc rag yc
┌─┬─┬─┐                  ┌───┬───┬───┐
│1│2│3│                  │0 1│2 3│4 5│
└─┴─┴─┘                  └───┴───┴───┘

Assembly. After agreement, the phrase v&.> applies v under > to corresponding boxed left and right argument cells, to produce an array of boxed result cells. It remains to assemble the overall result from the individual results.

Cells are brought to a common rank by adding leading unit axes, then to a common shape by padding. The overall shape is fm,sir, where fm is the frame and sir is the common shape of the individual results. This is a design choice: the individual results could be required to have a common shape without further intervention, but this permissive assembly proves useful. For example, open > on a list of boxed words yields a matrix with the words padded to a common length.

mrk   =: >./@:(rk&>)@,
crank =: mrk ,:@]^:(-rk)&.> ]
msh   =: >./@:( $&>)@,
cshape=: <@msh {.&.> ]
asm   =: > @ cshape @ crank
rank  =: 2 : 0
 'm l r'=.3&$&.|.y.
 ([: asm [: x.&.> m&cells) : ([: asm l&cells@[ (lag x.&.> rag) r&cells@])
)

The conjunction rank integrates the model components. The left argument x. is the verb v; the right argument y. is reshaped from the right to exactly 3 numbers and assigned to m, l, and r.

   [ za=. xa *&.> ya
┌───┬───┬─────┐
│0 1│4 6│12 15│
└───┴───┴─────┘
   asm za
 0  1
 4  6
12 15
   x * rank 0 _1 y
 0  1
 4  6
12 15

Zero Frame. If the frame contains 0 (as in 3*"1 i.0 4), there are no argument cells to apply v to, and the shape of a result cell (the value of sir) is indeterminate. Pesch [1986] describes a variety of strategies to address this problem. In J, the shape is calculated if v is uniform (see below); otherwise v is applied to a cell of fills.

Implementation. Rank is implemented by functions rank1ex and rank2ex ("rank execution") in file cr.c. A function f has access to the entire arguments of the verb that it implements, regardless of the ranks of the verb. Within f, rank effects can be achieved by invoking rank1ex and rank2ex, mediated by the macros F1RANK and F2RANK:

   A rank1ex(    A w,A self,I m,    AF f1);
   A rank2ex(A a,A w,A self,I l,I r,AF f2);

   F1RANK(m,  f1,self);
   F2RANK(l,r,f2,self);

a and w are the left and right arguments of the verb; f1 and f2 are functions which implement the monad and dyad; m,l,r are ranks; and self is an array representing the verb. For example, the dyad ": has ranks 1 _ and is implemented by the function thorn2, which uses F2RANK as follows:

   F2(jtthorn2){PROLOG;A da,ea,h,ma,s,y,*yv,z;B e,*ev; ...
    F2RANK(1,RMAX,jtthorn2,0);
    an=AN(a); t=AT(w);
    ...
   }

If the argument ranks are not greater than the verb ranks, then F2RANK (F1RANK) does nothing, and execution proceeds to the statement following the macro; if the argument ranks are greater, then F2RANK (F1RANK) invokes rank2ex (rank1ex), and on return therefrom exits f with the result obtained therefrom. In this scheme, rank2ex (rank1ex) invokes f repeatedly, but with arguments of rank bounded by the verb ranks.

A function may implement rank by other means. For example, the dyad { has ranks 0 _ and is implemented by the function from, which eschews rank2ex on numeric left arguments wherein rank effects are uniform and rather simple. (from does use rank2ex on boxed left arguments.) Atomic verbs also implement rank independently to exploit the special properties of such verbs.

Verbs derived from adverbs and conjunctions are always invoked with self. The macros PREF1 and PREF2 are used in such cases, wherein rank1ex and rank2ex are invoked with ranks extracted from self, and not with constants as in the use of F1RANK and F2RANK for primitive verbs.

Atomic (Scalar) Verbs

Not Yet Available

Obverses, Identities, and Variants

Verbs have additional parts — obverse, identity, and variants — which can not be specified as static data structures. Such information is embodied in functions.

• Obverses

A verb u is an obverse (usually the inverse) of a verb v if x=u v x for a significant subdomain of v. The obverse is used in the conjunctions under (&.) and power (^:). For example, exponential ^ and logarithm ^. are obverses, and:

   3 +&.^. 4  is  ^ (^.3) + ^.4        ^ ^:_1  is  ^.
   3 *&.^  4  is  ^.(^ 3) * ^ 4        ^.^:_1  is  ^

Obverses are produced by the function inv in file ai.c. (inv implements ^:_1.) The logic is a combination of table look-up and nested branch tables (switch/case).

Primitives. If the obverse of a primitive verb is itself primitive, the information is recorded in the 2-row table invf in file ai.c.

Bonded Verbs. Bonding (Currying) is fixing an argument of a dyad to derive a monad: n&v or v&n. For example, 10&^. is base-10 log and ^&0.5 is square root. The obverse of a bonded verb is computed by the subfunction invamp in file ai.c, invoked by inv as appropriate.

Prefix and Suffix. Sum prefix +/\ and sum suffix +/\. can be expressed as pre-multiplication by matrices obtained by applying +/\ and +/\. on the identity matrix; the obverse is therefore pre-multiplication by the matrix inverse of these matrices. (The actual obverse is a more efficient equivalent derived therefrom.) Similar reasoning applies to -, *, %, and to = and ~: on Boolean arguments. The logic is embodied as a sub-switch in inv, under case CBSLASH and case CBSDOT.

Reflex (~). The monad v~ computes y v y; for example, +~ is double. The obverses of a few such verbs are implemented by a sub-switch in inv, under case CTILDE.

Assigned Obverse. A verb may be assigned an obverse with the obverse conjunction (:.). f=: u :.v is like u but the obverse of f is v.

Other Verbs. inv applies to a few other verbs, including u@v and u&v, whose obverses are (v inv)@(u inv) and (v inv)&(u inv).

• Identities

u/y applies the dyad between the items of y. When y has zero items, the result of u/y obtains by applying to y the identity function ui of u, so-called because (iu y) u y or y u (iu y) is y for a significant subdomain of u.

Identity functions are computed by function iden in file ai.c. iden behaves like an adverb, applying to verbs and producing verbs. The logic is implemented as a branch table (switch/case). Not all verbs have identity functions; iden signals error in such cases.

• Variants

Variants of a verb are produced by the fit conjunction !., and are used to effect tolerant comparison (= < <. and so forth), formatting to a specific precision (":), shifts (|.), and factorial polynomials (^).

!. is implemented by function fit in file cf.c. The logic is implemented as a branch table (switch/case). Not all verbs have variants; fit signals error in such cases.

Error Handling

When an error is encountered in a function, the global variable jerr is set to an error number, and zero is returned. Therefore, when calling a function that can not have zero as a valid result (but does return a result), the returned value must be checked for zero; when calling a "void" function or one whose range includes zero, jerr must be inspected.

Error numbers range between 1 and NEVM, and are referenced by the EV* names (file jerr.h). The function jsignal (d.c) applies to an error number, sets jerr to this number, and displays the appropriate error message; jsignal exits immediately if jerr is already nonzero. evm is a list of the error messages. These messages are initialized by function evinit (file i.c), and may be inspected and changed by the user through 9!:8 and 9!:9.

The macro ASSERT (file j.h) is used extensively in argument validation. It applies to a proposition and an error number. For example, the following statements check whether w is a literal atom:

   ASSERT(!AR(w),EVRANK);
   ASSERT(LIT&AT(w),EVDOMAIN);

If the proposition is nonzero, execution proceeds to the next statement; otherwise, the indicated error is jsignal-ed and a zero is returned. The macros RZ and RE (file j.h) are used in function calls. RZ returns zero if its argument is zero; RE evaluates its argument, and returns zero if jerr is nonzero. For example, the function iota (implementing the monad i.) exploits RZ as follows:

   F1(jtiota){A z;I m,n,*v;
    F1RANK(1,jtiota,0);
    if(AT(w)&XNUM+RAT)R cvt(XNUM,iota(vi(w)));
    RZ(w=vi(w)); n=AN(w); v=AV(w);
    if(1==n){m=*v; R 0>m?apv(-m,-m-1,-1L):apv(m,0L,1L);}
    m=prod(n,v); z=reshape(mag(w),apv(ABS(m),0L,1L));
    DO(n, if(0>v[i])z=irs1(z,0L,n-i,jtreverse););
    R z;
   }

The arguments of a function may be the result of another function; the convention is that a function checks its arguments for zero and returns zero immediately in such cases. Thus, in iota above:

   z=reshape(mag(w),apv(ABS(m),0L,1L));

If reshape did not check for zero arguments, the statement would have to be elaborated:

   RZ(t0=mag(w));
   RZ(t1=apv(ABS(m),0L,1L));
   z=reshape(t0,t1);

A conventional function is a function that follows the conventions described herein — return zero on zero arguments and on errors. The defined type AF (file jtype.h) typifies a conventional function. Most functions in the system are conventional; in particular, all functions implementing primitives are conventional. Expressions and statements that use only conventional functions need not employ RZ or RE, and the resulting programs are neater. For example, consider functions shape and nub (file v.c), implementing the monads $ and ~., respectively:

   F1(jtshape){RZ(w); R vec(INT,AR(w),AS(w));}
   F1(jtnub){R repeat(nubsieve(w),w);}

shape must check for zero arguments RZ(w), because it applies the unconventional macros AR and AS to the argument w. In contrast, nub applies only conventional functions to its argument and to results of conventional functions on that argument.

Next • Previous • Index • Table of Contents