Nouns
An Implementation of J
typedef long I; typedef struct {I k,flag,m,t,c,n,r,s[1];}* A;All objects, whether numeric, literal, or boxed, whether noun, verb, adverb, or conjunction, are represented by arrays. For example, the string 'Cogito, ergo sum.', the atom 1.61803, and the table 11+i.3 4 are represented thus:
k flag m t c n r s[0] ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐ │ 32│ 0│ 20│CHAR│ 2│ 17│ 1│ 17│Cogi│to, │ergo│ sum│. │ └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘ k flag m t c n r ┌────┬────┬────┬────┬────┬────┬────┬────┬────┐ │ 28│ 0│ 8│ FL│ 2│ 1│ 0│ 1.61803│ └────┴────┴────┴────┴────┴────┴────┴────┴────┘ k flag m t c n r s[0] s[1] ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬ │ 36│ 0│ 48│ INT│ 2│ 12│ 2│ 3│ 4│ 11│ 12│ 13│ └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴ ┬────┬────┬────┬────┬────┬────┬────┬────┬────┐ │ 14│ 15│ 16│ 17│ 18│ 19│ 20│ 21│ 22│ ┴────┴────┴────┴────┴────┴────┴────┴────┴────┘The parts of an array, and macros for manipulating them, are as follows:
GA(xyz,t,n,r,s);creates an array named xyz of type t and rank r, having n atoms and shape s. If the rank is 0, s is ignored; if the rank is 1, again s is ignored, and the shape is set to n. Otherwise, if s is nonzero, GA initializes the shape from the r integers pointed to by s, and if s is 0, the shape is not initialized and must be initialized subsequently. GA returns zero if the array can not be created.
typedef char C; typedef double D; A ces,m,phi; I j,*s,*v; GA(ces,CHAR,17,1,0); memcpy((C*)AV(ces),"Cogito, ergo sum.",(size_t)17); GA(phi,FL,1,0,0); *(D*)AV(phi)=1.61803; GA(m,INT,12,2,0); s=AS(m); *s=3; *(1+s)=4; v=AV(m); for(j=11;23>j;++j)*v++=j;The following utilities (file u.c) and array constants (file i.c) are convenient for working with simple arrays. The frequency of use gives a sense of their utility.
Facility | Freq. | Description | |
sc(I k) | 217 | An integer atom with value k (equivalent to sc4(INT,k)) | |
sc4(I t,I k) | 5 | An atom of type t with 4-byte value k | |
scf(D x) | 26 | A floating point atom with value x | |
scc(C x) | 17 | A literal atom with value c | |
apv(I n,I b,I m) | 96 | The arithmetic progression vector b+m*i.n | |
str(I n,C*s) | 52 | A string (literal list) of length n with value the characters pointed to by s | |
cstr(C*s) | 93 | A string with value the characters in the 0-terminated string s | |
v2(I a,I b) | 72 | The integer vector a,b | |
v1(I k) | 15 | The integer vector ,k | |
vec(I t,I n,void*v) | 81 | A vector of length n of type t, with values pointed to by v | |
zero | 129 | 0 | |
one | 98 | 1 | |
two | 24 | 2 | |
neg1 | 22 | _1 | |
pie | 5 | 1p1 (pi conflicts with C usage) | |
a0j1 | 9 | 0j1 | |
ainf | 14 | _ | |
iv0 | 17 | ,2-2 | |
iv1 | 18 | ,2-1 | |
mtv | 73 | i.0 | |
mtm | 44 | i.0 0 |
Types
If x is an array, its type AT(x) specifies
how the atoms starting at AV(x) are to be interpreted.
In C programming terms, AV(x) must be cast to a
pointer of the appropriate C type:
C Data AT(x) Type Description B01 B Boolean (BOOL has a name conflict) LIT C literal (character; CHAR has a name conflict) INT I integer FL D double (IEEE floating point) CMPX Z complex BOX A boxed XNUM X extended precision integer RAT Q rational number SB01 P sparse boolean SLIT P sparse literal (character) SINT P sparse integer SFL P sparse floating point SCMPX P sparse complex SBOX P sparse boxed VERB V verb ADV V adverb CONJ V conjunction ASGN I assignment MARK I marker SYMB L locale (symbol table) CONW CW control word NAME NM name LPAR I left parenthesis RPAR I right parenthesisFor example, if x is literal and s=(C*)AV(x), then s[i] is character i of x. The C data types in the table are all typedef's found in file jtype.h; the data type V is explained in the Verbs section.
#define NUMERIC (B01+INT+FL+CMPX+XNUM+RAT+SB01+SINT+SFL+SCMPX) #define NOUN (NUMERIC+LIT+SLIT+BOX+SBOX)Then the phrase NUMERIC&AT(x) tests for numeric arrays, and the phrase NOUN&AT(x) tests for nouns. Such comparisons play a key role in the parser.
cvt(t,x) | Convert x to type t; signal error if not possible | |
pcvt(t,x) | Convert x to type t; return x if not possible | |
icvt(t,x) | Convert floating x to INT if the values are in range; otherwise just return x | |
bcvt(t,x) | Convert x to the "lowest" type |
#define R return A aib(A x){A*u,z;I j,*v; GA(z,INT,AN(x),AR(x),AS(x)); /* 1 */ u=(A*)AV(x); v=AV(z); /* 2 */ for(j=0;AN(x)>j;++j)*v++=AN(*u++); /* 3 */ R z; }Line 1 creates an integer array z having the same rank and shape as x. Line 2 initializes pointer values u and v for traversing x and z . Line 3 runs through the atoms of x, through u, and records the number of atoms in each. Since the data type of u is A*, the data type of *u is A and are subject to AN, AT, AV, etc.
Memory Management
When an array is created, malloc is
called to obtain the requisite storage;
when this storage is no longer needed, free is
called to return it to the
underlying system. No "garbage collection" is done. The performance
of this strategy is adequate on modern virtual memory systems.
To facilitate the implementation of alternative strategies,
the use of malloc and free is limited to a single
instance of each, in the file m.c.
The reference count of an array is incremented when it is assigned
a name, directly or indirectly, and is decremented when the name is
reassigned or erased; when the reference count of an array reaches 0,
its storage is freed.
When an array is created, a pointer to it is entered in a "temp stack"
(tstack in file m.c.)
A temp is an array on this stack with a reference count of one.
The temp stack plays an important role in the
main execution loop.
In an iteration of the loop,
The top of the temp stack is recorded;
A line of user-input is executed; and
Temps from the current top-of-stack to the old top-of-stack
recorded above, are freed.
This device permits functions to be written without explicit
memory management code. For example, the monad ~. is written:
F1(jtnub){R repeat(nubsieve(w),w);}And nub need not be concerned with temps used in repeat or nubsieve, because they are accounted for in the main loop.
Global Variables
The only global variables used in the system are
constants which are assigned exactly once.
(For example, the array constant zero and the internal complex
number zeroZ.) All other variables non-local to functions
are accessed through the parameter jt.
jt has defined type J, a pointer to
a struct defined in file jt.h.
Nearly all functions in the system has jt
as its first function argument,
and all such functions have the letters jt as a prefix in their names.
The file ja.h defines aliases for these names, so that a call to a
function jtxyz(jt,a,w,h) is
actually written as xyz(a,w,h).
For example, the conjunction &,
described in Adverbs and Conjunctions,
is implemented by a function
defined and declared as jtamp,
having prototype A jtamp(J jt,A a,A w), but calls to this
function are written as amp(a,w),
and discussions on this function refer to it as amp.
jt makes it possible to execute multiple instances
of J in the same process.