Holy Cow

The Ape of Thoth

A Thelemic Text Daemon

Version 1.5

Do what thou wilt shall be the whole of the Law

Consulting the Oracle

The heart of the 'Ape of Thoth' (APE for short) is a CGI script. The script accepts requests in either POST or GET mode. This means that your page can access the script with an HTML form, or directly from the URL (which would be especially useful if you want to provide a search link without the use of a form).

The script and all supporting files reside (for now) in the following place:

http://larabell.org/Hanuman

Normally, the script is invoked with a text file name and a command. For example, the following URL will render a random verse from the Book of the Law:

http://larabell.org/Hanuman/Hanuman.cgi/liber220/random

In this case, the text file is 'liber220' and the command is 'random'.

There is an index.html file in the Hanuman directory which demonstrates the major features of the script. The script is activated with a command after the script name in the URL. Calling the script with no command and no text file name causes the file 'index.html' to be sent to the requestor.
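
To make the URL layout concrete, here is a minimal Python sketch of how the part of the URL after the script name could be split into a text file name and a command. It is not the script's actual code: the function name 'split_path' is hypothetical, and it assumes a command is recognized by comparing the last path segment against the command names described in this document.

```python
# Command names taken from this document; recognizing a command by
# name is an assumption about how the script tells the two parts apart.
COMMANDS = {"random", "quote", "concord", "search"}


def split_path(path_info):
    """Split '/liber220/random' into ('liber220', 'random').

    Either element of the result is None when it is absent.
    """
    segments = [s for s in path_info.split("/") if s]
    command = None
    if segments and segments[-1] in COMMANDS:
        command = segments.pop()
    text = segments[0] if segments else None
    return text, command


print(split_path("/liber220/random"))  # ('liber220', 'random')
print(split_path("/random"))           # (None, 'random')
print(split_path(""))                  # (None, None)
```

With an empty path, both values are None, which corresponds to the case where 'index.html' is sent to the requestor.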

Selecting a range of verses

A subset of the verses in any given text may be singled out for viewing and/or analysis. To select a range, simply add a colon and either a single verse number or a pair of verse numbers separated by a hyphen. For example, if we wanted only verses 5 through 10 of The Book of the Law, we could use the URL:

http://larabell.org/Hanuman/Hanuman.cgi/liber220:5-10

The range may be specified in any of the following ways:
n Selects the single verse numbered 'n'
-n Selects all verses up to 'n'
n- Selects all verses from 'n' onward
n-m Selects all verses from 'n' to 'm'
Note that if you specify a range for a multi-chapter document or for the 'All Documents' document, the specified range (or as much of that range as exists) will be returned for each chapter or document.
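
The four range forms can be sketched in Python as follows. This is only an illustration, not the script's own parser: the names 'parse_range' and 'select' are hypothetical, and the sketch assumes that verse numbers are plain integers (lettered verses are not handled).

```python
def parse_range(spec):
    """Turn 'n', '-n', 'n-' or 'n-m' into (first, last).

    None stands for an open end of the range.
    """
    first, hyphen, last = spec.partition("-")
    if not hyphen:                      # single verse 'n'
        n = int(first)
        return n, n
    return (int(first) if first else None,
            int(last) if last else None)


def select(verses, spec):
    """Keep the (number, text) pairs whose number lies in the range."""
    lo, hi = parse_range(spec)
    return [(n, t) for n, t in verses
            if (lo is None or n >= lo) and (hi is None or n <= hi)]


print(parse_range("5"))     # (5, 5)
print(parse_range("-10"))   # (None, 10)
print(parse_range("5-"))    # (5, None)
print(parse_range("5-10"))  # (5, 10)
```

Because 'select' simply skips numbers outside the range, it returns as much of the requested range as exists, in line with the note above.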

Be aware that filtering the selected verses from the document makes the resulting page load slightly slower, so please use the range selector only when you have a good reason to view fewer than all the available verses of the text.

Links within documents

Each chapter title and each verse number is tagged as a hyperlink. The primary reason for this is to make the verse stand out in the text. For the most part, the links point to the verse itself in the complete document. If you requested a range of verses or searched a document, these links can be used to jump to the selected verse in the complete document.

The chapter and verse links in the individual chapters of multi-chapter texts point to the corresponding locations in the complete text. This allows the user to see an entire chapter 'in context'. The verse link for the random verse selection page (and for any search performed across multiple texts) points to the individual verse in the book or chapter from which it came. To see a more complete selection of text, follow the hyperlink shown for that verse.

The links shown for each word in a concordance page invoke a search for the word selected across the same text used to generate the concordance. The search is performed for whole words only and in the same case sensitivity mode as was the original concordance request.

The one exception to the linking scheme occurs between the documents known as The Book of the Law and The Book of the Inlaws. The latter is a parody of the former and the links in each of these documents point to the corresponding verse in the other. If you follow one of these links and you really wanted the original verse, just select the link in the corresponding verse and you're back in the original document. Neat, eh?

Making GET/POST requests

Additional information is often needed by the script in order to carry out the request. This information may be passed as part of the URL by appending a question mark (?) and a series of fields in any order. The following fields are defined:
input=[filename] Specify the text file to search
search=[searchtext] Specify the word(s) or phrase for which to search
type=[all|any|none|phrase] Specify 'all words', 'any word', 'no words', or 'whole phrases'
case=[on|off] Specify 'case sensitive' or 'case insensitive'
words=[on|off] Specify 'whole words only' or 'any match'
hyphen=[on|off] Split hyphenated words (concordance only)
compact=[on|off] Filters the 'dd' and 'dt' tags for a more compact output
start=[n] Indicates the number of the first verse to scan
end=[n] Indicates the number of the last verse to scan
range=[n-m] Indicates a range of verses to scan (either 'n' or 'm' may be omitted)
Notice that the text file name may be specified as a field. This is so that the specific text to be searched can be selected as part of a form on the referring page. The 'input=[filename]' field is only used when the file is not specified as part of the URL, as in the following example (which will render the same result as the example above):

http://larabell.org/Hanuman/Hanuman.cgi/random?input=liber220

Because of the alternate ways in which data may be supplied to the script, conflicts can arise. The following rules are applied when data elements overlap:
  1. If an input text is given in the URL path, it overrides the 'input=[filename]' field.
  2. If a 'range=[n-m]' field is given, it overrides both the 'start=[n]' and 'end=[n]' fields.
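
The two precedence rules above can be illustrated with a short Python sketch. The function name 'resolve' and the use of a plain dictionary for the decoded fields are assumptions made for this example; the script itself may organize this differently.

```python
def resolve(path_text, fields):
    """Apply the two override rules and return (text, start, end).

    path_text -- text file name taken from the URL path, or None
    fields    -- dict of decoded query/form fields (all values are strings)
    """
    # Rule 1: a text named in the URL path wins over 'input='.
    text = path_text if path_text else fields.get("input")

    # Rule 2: 'range=' wins over both 'start=' and 'end='.
    start, end = fields.get("start"), fields.get("end")
    if fields.get("range"):
        first, hyphen, last = fields["range"].partition("-")
        start = first or None
        end = (last if hyphen else first) or None
    return text, start, end


print(resolve("liber220", {"input": "liber30"}))
# ('liber220', None, None)
print(resolve(None, {"input": "liber30", "start": "1", "range": "5-10"}))
# ('liber30', '5', '10')
```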

Commands supported

The simplest command is the 'null' command (that is, no command at all). In this case, the script will return the complete text of the named document:

http://larabell.org/Hanuman/Hanuman.cgi/liber220, or
http://larabell.org/Hanuman/Hanuman.cgi?input=liber220

Another simple command is the 'random' command. This command returns one verse at random from the selected text:

http://larabell.org/Hanuman/Hanuman.cgi/liber220/random, or
http://larabell.org/Hanuman/Hanuman.cgi/random?input=liber220

The 'quote' command is similar to the 'random' command except that only the verse itself is returned and the hyperlink has the full URL of the APE Home Page. The only real use for this command is to add a random Holy Book quote to another web page using Server-Side Includes.

http://larabell.org/Hanuman/Hanuman.cgi/liber220/quote

The 'concord' command produces a list of all the words which occur in the selected text and links each to the search command for that particular word:

http://larabell.org/Hanuman/Hanuman.cgi/liber220/concord, or
http://larabell.org/Hanuman/Hanuman.cgi/concord?input=liber220

More complex is the 'search' command. This command may be supplied several optional fields to control the search algorithm. If the 'search=' field is not supplied, the effect is the same as with the 'null' command (that is, the entire text is returned). Assuming the search text is 'text' and case sensitivity is turned on, you could use:

http://larabell.org/Hanuman/Hanuman.cgi/liber220/search?search=text&case=on, or
http://larabell.org/Hanuman/Hanuman.cgi/search?input=liber220&search=text&case=on
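
If you generate search links from a program rather than typing them by hand, the fields need to be URL-encoded. The following Python sketch shows one way to do that with the standard library; the helper name 'search_url' is hypothetical, and only the script address and field names come from this document.

```python
from urllib.parse import urlencode

# Script address as given in this document.
BASE = "http://larabell.org/Hanuman/Hanuman.cgi"


def search_url(text, words, **options):
    """Build a 'search' URL for one text.

    'options' may carry any of the documented fields, such as
    case="on", words="on" or type="phrase".
    """
    fields = {"search": words}
    fields.update(options)
    return "%s/%s/search?%s" % (BASE, text, urlencode(fields))


print(search_url("liber220", "text", case="on"))
# http://larabell.org/Hanuman/Hanuman.cgi/liber220/search?search=text&case=on
```

Note that 'urlencode' replaces spaces with '+' signs, which is the usual encoding for form fields in a GET request.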

Documents supported

Following is a list of the documents supported as of the writing of this description. The list is always growing and it pays to check the 'index.html' file to see the latest additions.
liber1 Liber B vel Magi
liber7 Liber Liberi vel Lapidis Lazuli, All
liber7.0 Liber Liberi vel Lapidis Lazuli, Prologue
liber7.I Liber Liberi vel Lapidis Lazuli, Chapter I
liber7.II Liber Liberi vel Lapidis Lazuli, Chapter II
liber7.III Liber Liberi vel Lapidis Lazuli, Chapter III
liber7.IV Liber Liberi vel Lapidis Lazuli, Chapter IV
liber7.V Liber Liberi vel Lapidis Lazuli, Chapter V
liber7.VI Liber Liberi vel Lapidis Lazuli, Chapter VI
liber7.VII Liber Liberi vel Lapidis Lazuli, Chapter VII
liber8 Liber VIII
liber10 Liber Porta Lucis
liber27 Liber Trigrammaton
liber30 Liber Librae
liber61 Liber Causae, All
liber61.P Liber Causae, Preliminary Lection
liber61.H Liber Causae, History Lection
liber64 Liber Israfel
liber65 Liber Cordis Cincti Serpente, All
liber65.I Liber Cordis Cincti Serpente, Chapter I
liber65.II Liber Cordis Cincti Serpente, Chapter II
liber65.III Liber Cordis Cincti Serpente, Chapter III
liber65.IV Liber Cordis Cincti Serpente, Chapter IV
liber65.V Liber Cordis Cincti Serpente, Chapter V
liber66 Liber Stellae Rubeae
liber90 Liber Tzaddi
liber156 Liber Cheth
liber175 Liber Astarte
liber200 Liber Resh
liber220 The Book of the Law, All
liber220.I The Book of the Law, Chapter I
liber220.II The Book of the Law, Chapter II
liber220.III The Book of the Law, Chapter III
liber370 Liber A'ash
liber474 Liber Os Abysmi
liber536 Liber Batrachophrenoboocosmomachia
liber555 Liber Had
liber813 Liber Ararita
liber999 Liber Call Me Al, All
liber999.I Liber Call Me Al, Chapter I
liber999.II Liber Call Me Al, Chapter II
liber999.III Liber Call Me Al, Chapter III
In addition, there are three special 'pseudo-documents' provided. The file called 'liber0' has the complete contents of all the documents supported by the APE. The file 'liber00' has all the documents except Liber Call Me Al, and 'liber000' has only the A∴A∴ Class A documents.

How to link the Oracle to your page

The APE is best accessed via an HTML form. The GET method is also supported, which means that everything can be specified as part of the URL. However, if you want the user to be able to specify the search string and to control the searching, use a form similar to the following:

<form method=get action="http://larabell.org/Hanuman/Hanuman.cgi/search">
Search for: <input NAME="search" size=60 maxlength=60><br>
in the text: <select NAME="input">
<option VALUE="liber30">Liber Librae (30)
<option VALUE="liber90">Liber Tzaddi (90)
<option VALUE="liber220">Liber AL vel Legis (220)
</select><br>
<input name="type" type=radio value="any" checked>any of the above words
<input name="type" type=radio value="all">all of the above words
<input name="type" type=radio value="phrase">complete phrase<br>
<input name="case" type=checkbox>Case sensitive
<input name="words" type=checkbox>Whole words only
<input type="submit" value="Search">
<input type="reset" value="Clear">
</form>

File format

When a request is made to the APE, the filename is used to determine which HTML file to open. The extension '.html' is added to the file name passed in the URL and this file is read by the script. The HTML file can have any text or formatting within it. In addition, it should contain one or more 'pseudo-comments' which control where and how the text of the search should be inserted.

HTML files

The typical HTML file will contain at least one 'foreach...end' pseudo-comment pair which specifies the data file to be read:

<!-- foreach filename -->
...Modified HTML goes in here...
<!-- end -->

When the script reads this sequence, it opens the file specified by 'filename' and proceeds to search the verses contained therein according to the options selected. For each matching verse, it emits the HTML code contained in the 'foreach...end' pair, with certain substitutions. For each 4-letter keyword which appears in the format %keyw%, the appropriate information corresponding to the verse in question is printed instead. The possible keywords are:
file the name of the data source file
numb the verse number of the verse
link the name of the data source file, a colon, and the verse number
utag the verse number (or '*' for non-numbered verses)
name the title of the document plus 'utag' above
text the text of the verse itself
word the word from the concordance list (concord only)
case the case sensitivity switch (concord only)
Multiple lines of HTML may be placed between the 'foreach' and the 'end' but both comment tags must appear at the start of a new line and must not be followed by other text or HTML tags.
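
The keyword substitution can be pictured with the following Python sketch. It is an illustration only: the function name 'render', the sample template, and the sample verse values are all made up for this example, and leaving unknown keywords untouched is an assumption rather than documented behavior.

```python
import re

# Matches a 4-letter keyword wrapped in percent signs, e.g. %text%.
KEYWORD = re.compile(r"%([a-z]{4})%")


def render(template, verse):
    """Replace each %keyw% in 'template' with verse['keyw'].

    Keywords missing from 'verse' are left as they are (an assumption).
    """
    def substitute(match):
        key = match.group(1)
        return str(verse[key]) if key in verse else match.group(0)
    return KEYWORD.sub(substitute, template)


# Hypothetical template body and verse record, for illustration only.
template = '<dt><a href="%link%">%utag%</a></dt><dd>%text%</dd>'
verse = {"link": "liber30:1", "utag": "1", "text": "The text of the verse"}
print(render(template, verse))
# <dt><a href="liber30:1">1</a></dt><dd>The text of the verse</dd>
```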

An additional pseudo-comment may be used to specify HTML text to be emitted when no data file lines satisfy the match. This HTML is not subject to the same substitutions, as there is no verse involved.

<!-- foreach filename -->
...HTML to print a single verse...
<!-- else -->
...HTML to print if no verses match...
<!-- end -->

If the 'filename' is omitted, then the file name originally passed in the URL is used instead. This is important for the 'random' command, which always uses the file 'random.html'. In this case, the actual filename which will be searched is not known when the 'random.html' file is created.

At the top of each HTML file is a tag of the form

<base HREF="http://larabell.org/Hanuman/">

which specifies the URL of the directory in which the APE script and data files reside. This tag is required in order for the links on the document verses to function correctly, and it must be changed if the script is moved to another location. The data files must remain in the same directory as the script.

Data files

The data files containing the text to be searched must conform to a specific format. Each 'verse' of the text must appear on a separate line and there may be no blank lines in the file. Each line consists of a verse number (or letter), a right paren, and the text of the verse, thus:

1) The text of the first verse
2) The text of the second verse
3) The text of the third verse

In order to make the file maintainable, a single 'line' of text may be divided into multiple physical lines by appending the backslash (\) character to the end of each line which is to be continued on the next physical line.

For example, the following consists of three 'lines' of text:

1) The text \
of the first line
2) The text \
of the second line
3) The text \
of the third line
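
A reader for this format might look like the Python sketch below. It is not the script's own code: the function name 'read_verses' is hypothetical, and the sketch assumes that the backslash is the very last character on a continued line and that the first right paren on a line ends the verse number.

```python
def read_verses(lines):
    """Join continued lines and split each verse into (number, text)."""
    verses = []
    pending = ""
    for raw in lines:
        line = raw.rstrip("\n")
        if line.endswith("\\"):
            # Continued on the next physical line: drop the backslash.
            pending += line[:-1]
            continue
        logical = pending + line
        pending = ""
        number, _, text = logical.partition(")")
        verses.append((number.strip(), text.strip()))
    return verses


sample = [
    "1) The text \\\n",
    "of the first line\n",
    "2) The text of the second line\n",
]
print(read_verses(sample))
# [('1', 'The text of the first line'), ('2', 'The text of the second line')]
```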

Include files

A data file may include other data files. This would be used to create a single text database for searching multiple documents without having to duplicate the actual text of the documents. To include the text from another data file, use the following line:

#include filename <tag>

where 'filename' is the name of the file to be included and 'tag' is the name of the document in human-readable format. Included lines are displayed in a slightly different manner from other text lines. The human-readable 'tag' of the file is prepended onto the verse number in order to identify in which of possibly several documents the verse in question was found.
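
The effect of an include line can be sketched in Python as follows. Several things here are assumptions made for illustration: the function name 'expand', the use of a dictionary in place of files on disk, the reading of the angle brackets around the tag as literal characters, and the exact way the tag is joined to the verse number.

```python
import re

# Assumes the angle brackets around the tag are typed literally.
INCLUDE = re.compile(r"#include\s+(\S+)\s+<(.*)>\s*$")


def expand(name, files, label=None):
    """Return (display_number, text) pairs for 'name', following includes.

    files -- dict mapping a file name to its list of logical lines
             (a stand-in for reading the data files from disk)
    label -- human-readable tag of the including line, if any
    """
    result = []
    for line in files[name]:
        match = INCLUDE.match(line)
        if match:
            result.extend(expand(match.group(1), files, match.group(2)))
            continue
        number, _, text = line.partition(")")
        number = number.strip()
        shown = "%s %s" % (label, number) if label else number
        result.append((shown, text.strip()))
    return result


# Made-up file contents, for illustration only.
files = {
    "all": ["#include liber30 <Liber Librae>", "1) A verse of its own"],
    "liber30": ["1) First verse", "2) Second verse"],
}
print(expand("all", files))
# [('Liber Librae 1', 'First verse'), ('Liber Librae 2', 'Second verse'), ('1', 'A verse of its own')]
```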

Access logging

The script has a built-in access log feature. If a file named 'logfile' is present in the directory when the script is run, information regarding the current invocation will be appended onto this file. To disable the logging, simply delete or rename the log file.

The script can also notify the owner (or anyone with an e-mail address) of each access. This is to be used primarily for debugging. If a file named 'logmail' exists, the file is read and each line is assumed to be an e-mail address. The command and file name of the request are sent to each address listed.

Optimizations

There are two processing optimizations that the script will take, if available. The first involves the removal of the backslash (\) continuation characters from the source files. If the requested file is not found in the directory from which the script is run, but a copy is found in a sub-directory named 'source', the file in 'source' is assumed to contain continuation characters and a one-verse-per-line copy is written into the current directory. This, in practice, saves little in the way of processing time and consumes twice the disk space, since there are then two copies of the data. This may be disabled by moving the sources to the directory containing the script and deleting the 'source' directory.

The second optimization saves quite a bit more processing time, but only for accesses of entire documents. Whenever a whole document is accessed (that is, no search/random/concord command and no range specified), the directory named 'compiled' is searched. If the document in question is present, it is returned to the user. If not, the script will write a plain HTML copy of the entire document into that directory and then return it to the user. This may be disabled by deleting the 'compiled' directory.
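
The second optimization is a simple write-through cache. The Python sketch below shows the idea under stated assumptions: the function name 'whole_document' is hypothetical, the cached copy is assumed to be stored under the document name plus '.html', and the 'render' argument stands in for whatever produces the full HTML page.

```python
from pathlib import Path


def whole_document(name, render, compiled_dir="compiled"):
    """Return the full HTML for 'name', using the cache when possible.

    render -- callable that builds the HTML for a document name
    """
    cache = Path(compiled_dir)
    if not cache.is_dir():
        # No 'compiled' directory: the optimization is disabled.
        return render(name)
    target = cache / (name + ".html")   # cached file name is an assumption
    if target.exists():
        return target.read_text(encoding="utf-8")
    html = render(name)
    target.write_text(html, encoding="utf-8")
    return html
```

A cached copy is never refreshed in this sketch, so after editing a data file the corresponding file in the cache directory would have to be removed by hand.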

Using server-side includes

If you want to have a random quote-of-the-day on your web page, you can use a trick called Server-Side Includes (if, of course, your ISP supports them). Essentially, you must make a 'quote' request of the APE, save the output to a file, and include the file in your HTML code. If you have the Lynx browser available on your system, the request is easy:

lynx -source http://larabell.org/Hanuman/Hanuman.cgi/liber220/quote

will fetch a quote from The Book of the Law. Redirect that to a file and you're half-way there. (If you run the fetch command as a cron job once per day, the quote will change automatically.)

Now, assuming you have SSI capability, you can include the following in any '.shtml' file:

<!--#include file="filename" -->

Of course, it is also possible to put the lynx command directly into the '.shtml' include tag in order to get a random verse every time you access the page in question. However, please don't do this on high-volume pages, as the APE is running under a bandwidth quota and anyone abusing the system will be locked out.

The 'Ape of Thoth' script was written by
Joe Larabell (hanuman@larabell.org).