I've finally gotten around to uploading my custom Mac OS X keyboard layout. The most significant additions over the standard U.S. keyboard layout are: mathematical symbols (relevant to calculus and symbolic logic), Greek characters, and arrows. It also makes (, ), :, and | unshifted.

One bit of advice sometimes given to the novice programmer is don't ever compare floating-point numbers for equality, the reason being that floating-point calculations are inexact, and one should use a small epsilon, allowable error, instead, e.g. if (abs(value - 1.0) < 0.0001).

This advice is actually wrong, or rather, overly strong. There is a situation in which it is 100% valid to compare floats, and that is an cache or anything else which is comparing a float with, not a specific constant (in which case the epsilon notion is appropriate), but rather a previous value from the same source; floating-point numbers may be approximations of exact arithmetic, but that doesn't mean you won't get the same result from the same inputs.

So, don't get any bright ideas about outlawing aFloat == anotherFloat.

Unfortunately, there's a case in which the common equality on floats isn't what you want for previous-value comparison anyway: for most definitions of ==, NaN ≠≠ NaN. This definition makes sense for numerics (and is conformant to IEEE floating point specifications), because NaN is “not a number”; it's an error marker, provided as an alternative to exceptions (or rather, floating point error signals/traps/whateveryoucallit) which propagates to the end of your calculation rather than aborting it and requiring immediate error handling, which can be advantageous in both code simplicity and efficiency. So if you think about calculating within the space of “numbers”, then NaN is outside of that. But if you're working in the space of “results of calculations”, then you probably want to see NaN == NaN, but that may not be what you get.

Mathematically, the floating-point comparison is not an equivalence relation, because it is not reflexive on NaN.

(It's also typically the case that 0 == -0, even though positive and negative zero are distinct values. Oh, and NaNs carry data, but I'm not talking about that.)

What to do about it, in a few languages:

JavaScript

Even the === operator does not compare identities rather than numeric values, so if you want to compare NaN you have to do it as a special case. Google Caja handles it this way:

/**
 * Are x and y not observably distinguishable?
 */
function identical(x, y) {
  if (x === y) {
    // 0 === -0, but they are not identical
    return x !== 0 || 1/x === 1/y;
  } else {
    // NaN !== NaN, but they are identical.
    // NaNs are the only non-reflexive value, i.e., if x !== x,
    // then x is a NaN.
    return x !== x && y !== y;
  }
}
Common Lisp

The = operator generally follows the IEEE comparison (if the implementation has NaN at all) and the eql operator does the identical-object comparison.

E

The == operator is guaranteed to be reflexive, and return false for distinguishable objects, so it is appropriate for the “cache-like” use cases, and the <=> operator does conventional !(NaN <=> NaN), 0.0 <=> -0.0 floating-point comparison.

Photo taken by me in a Staples store. Text added with Zach Beane's roflbot.

There seems to be a recurring misconception about writing .asd files. It is not necessary to start your asd with (defpackage :foo.system ...) (in-package :foo.system) to “avoid polluting the package asdf loads your system in”.

Every time an .asd file is loaded, ASDF creates a fresh package to load it in. The relevant code from ASDF is:

(defun make-temporary-package ()
  (flet ((try (counter)
           (ignore-errors
                   (make-package (format nil "ASDF~D" counter)
                                 :use '(:cl :asdf)))))
    (do* ((counter 0 (+ counter 1))
          (package (try counter) (try counter)))
         (package package))))

(defun find-system (...)
  ...
      (let ((package (make-temporary-package)))
        (unwind-protect
             (let ((*package* package))
               ...
               (load on-disk))
          (delete-package package)))
  ...)

So, whenever you load an asd file, the package is fresh and :uses CL and ASDF.

There are reasons to define a package for your asdf system definition.

  • If you are creating any classes, such as custom ASDF component types, then, since classes are not garbage collected (MOP specifies that the list of a class's subclasses can be retrieved, and no implementation I know of uses weak references for the purpose), naming them with symbols in a consistent package (rather than the asdf temporary package) ensures that reloading the system definition does not create useless additional copies of the classes.
  • If you are defining functions or other named entities in your system definition (which should of course only be done if they are also necessary to load the system) and want to be able to refer to them, such as for debugging, with nice interned symbol names.
  • If you want to be able to (load "foo.asd") rather than going through asdf:find-system.

The disadvantages I can think of for defining a package for your system definitions are that the package list becomes more cluttered (only noticeably if for some strange reason you're loading many system definitions and not loading the defined systems), you're using more of the global package namespace (probably not a significant concern if you use a naming scheme like foo and foo.system), and your code is marginally more complex.

So there are good reasons for (and maybe some against) defpackage in your asd, but package hygiene is not one of them.

[Edited 2009-11-26 to be less worded against defpackage after consideration and feedback.]

I've been having a lot of thoughts lately I'd like to publish, but seem a little bit too short for A Blog Post. Some options

  1. Considered signing up for Twitter. Pro: short stuff is expected there; participating in the hot new thing; people interested in short-form will be on twitter and using the follow feature. Con: Reliability problems; YA thing to manage credentials and backup for; no hyperlinks; my favorite username is taken.
  2. I could post stuff here. Particularly, LJ supports titleless posts.
  3. I could use some other site, or build my own publishing system. [er, why?]

On reflection, I'm thinking to just post the stuff here and not have yet-another-distinct-place/site.

Readers, what would you prefer?


(This post is tagged “lisp” solely so that Planet Lisp will pick it up.)

Rosetta Code is a wiki where a variety of programming problems and language features are demonstrated in many languages, allowing language learning and comparison of features and paradigms.

If you'd like to contribute Common Lisp code to the project, I've just completed a classification of tasks in CL (like my prior one for E), so that it's easy to find the kind of problem one wants to work on at the moment.

One thing I find interesting about working on Rosetta Code is that many tasks bring with them some particular perspective — someone's notion of how programming necessarily works — and it can be a challenge to figure out the analogous way, or best way, to do it in your language is, and then explain it.

Common Lisp provides compile-file whose purposes is to convert a CL source file into an (implementation-defined) format which usually has precompiled code and is designed for faster loading into a lisp system.

compile-file takes the pathname of a textual lisp source file. But what if you want to compile some Lisp code that's not in a file already, perhaps because you translated/compiled it from some not-amenable-to-the-Lisp-reader input syntax, or because it contains unREADable literals? You can use a "universal Lisp file", which I know two ways to create (use whichever you find cleaner):

(cl:in-package :mypackage)
#.mypackage::*program*
or
(cl:in-package :mypackage)
(macrolet ((it () *program*))
  (it))

Suppose this is in "universal.lisp". Then to use it:

(defvar *program*)

(defun compile-to-file (form output-file)
  (let ((*program* form))
    (compile-file #p"universal.lisp"
                  :output-file output-file)))

This is just a minimal example; you'll also want to appropriately handle the return value from compile-file, provide an appropriate pathname to the universal file, etc. For example, here's an excerpt of the relevant code from E-on-CL, where I have used this technique to compile non-CL sources (emakers) into fasls:

(defparameter +the-asdf-system+ (asdf:find-system :e-on-cl))
(defvar *efasl-program*)
(defvar *efasl-result*)
...
(defun compile-e-to-file (expr output-file fqn-prefix opt-scope)
  ...
  (let* (...
         (*efasl-program*
           `(setf *efasl-result*
                  (lambda (...) ...))))
    (multiple-value-bind (truename warnings-p failure-p)
        (compile-file (merge-pathnames
                        #p"lisp/universal.lisp"
                        (asdf:component-pathname +the-asdf-system+))
                      :output-file output-file
                      :verbose nil
                      :print nil)
      (declare (ignore truename warnings-p))
      (assert (not failure-p) () "Compilation for ~A failed." output-file))))

(defun load-compiled-e (file env)
  (let ((*efasl-result* nil)
        ...)
    (load file :verbose nil :print nil)
    (funcall *efasl-result* env)))

Note that the pathname is computed relative to the ASDF system containing the universal file; also note the use of a variable *efasl-result* to simulate a "return value" from the compiled file, and the use of a lambda to provide a nonempty lexical environment, both of which are features not directly provided by the CL compiled file facility.

Apple's Sampler is a profiler based on the principle of periodically collecting the entire call stack of the executing threads, then summarizing these stacks to show what occurs frequently; primarily, as a tree, rooted at the bottom of the stack, where each node shows the number of times that call sequence was found on the stack.

SBCL's sb-sprof is a profiler which also collects call stacks, but its summary report is much less useful to me as it does not provide the per-branch counting; just top-of-stack frequencies and a caller/callee graph.

Therefore, I examined Sampler's file format and wrote code to generate it from sb-sprof's record.

Screenshot of Sampler with SB-SPROF data

The file is mixed text/binary, LF line endings. The grammar, as far as I've determined it, is:

  "@supersamplerV1.0" LF
  "@symboltableV1.1" LF
  (TAB int32<id> TAB int32<unknown> 
   TAB text<symbol> 
   TAB text<library-path> TAB text<library-path> LF)*
  "@end" LF
  (
    "@threadV1.0" TAB int16Hex<thread-id> LF
    (
      TAB int32<1> int32<0> int32<1> int32<count of stack-frame> (int32<stack-frame>)* LF
    )*
  )*
  "@end" LF

where by "int32" I mean a big-endian 32-bit (unsigned?) integer (i.e. four not-necessarily-ASCII bytes), and by "int16Hex" I mean a 16-bit integer in hexadecimal (i.e. four ASCII bytes).

"id" is an arbitrary identifier for this symbol. "unknown" is occasionally nonzero, but I don't know what it means. "symbol" is the name of a function/method found on the stack. "library-path" is the pathname to the object file it was loaded from (relative in the case of a standard framework, e.g. "Carbon.framework/HIToolbox.framework/HIToolbox").

"thread-id" is an identifier for the thread, which should occur as an "id" in the symbol table; the upper 16 bits evidently must be 0. Thread symbol table entries have a name and library path which is the string ("Thread_" int16<thread-id>); I have not confirmed whether this is necessary.

Each entry in a @thread block is one sampling of the whole stack of that thread. I do not know what the 1, 0, and 1 mean, but the fourth integer is the number of frames on the stack; immediately after are that many integers, each of which is an id from the symbol table.

Files generated from this structure are accepted by Sampler, but not always by Shark; I don't know why, and my attempt at tracking it down made it seem to depend on the size of the trace file.

Here is code to generate such a file from sb-sprof data; it should be loaded in the SB-SPROF package: SB-SPROF to Sampler )

This code generates a noninteractive Sampler-style tree report from SB-SPROF data. SB-SPROF tree report )

I recently found I wanted to retrieve some old Apple IIgs image files. After vague memories and some research, I found that they were “Super Hi-Res” images, which are dumps of screen memory in SHR mode, passed through the PackBytes compression routine.

I haven't found any on-the-web documentation of the SHR layout, but according to books and my successful decoding, the pixel data starts at the beginning of the file, has 160 bytes per row, 200 rows, and no end-of-row data.

After the image data are 200 scanline control bytes (SCBs) and 16 color tables. I haven't looked at decoding these yet.

In 320 mode, each pixel is 4 bits specifying that color in a 16-position color table. In 640 mode, each pixel is 2 bits, specifying ((x-position mod 4) + pixel) in the color table; the default color table has black and white in the same position in each group of 4, and the same colors in the 1st and 3rd, and 2nd and 4th, subgroups of the color table. Thus, any pixel can be black or white (which was used for fonts) and pairs of pixels can be any of 15 distinct dithered colors (there are necessarily two grays).

Here's a program in C to decompress PackBytes format:

unpackbytes.c )

This program, in Common Lisp, will convert an uncompressed SHR image to PPM:

shr.lisp )

The :ww-colors option exists because the particular files I wanted to convert were written by a program for running the WireWorld cellular automaton, which used a custom palette, but wrote its pattern files with the standard palette.

If SBCL reports that the sb-posix contrib module fails to build, it may be a good idea to:

$ sudo chmod o-w /

Projects

Saturday, October 9th, 2004 12:22

Implementing the stream-IO interfaces for E. It's mostly complete, but there's an executing-blocking-code-in-the-wrong-thread bug which Shouldn't Be Happening. I'm inclined to think it's a bug in E, except that the code explicitly does things which break E's reference discipline and I'm not sure that it isn't due to that.


An RDF parsing/processing library in E, which was the subject of the previous post. Stalled because I found the design didn't allow for provenance/contexts in merged graphs.


Recently discovered: Lion Kimbro's OneBigSoup project (weblog, wiki). Lots of ideas here.

Goal, roughly: More interconnection between existing Internet-based social tools.

Pieces I'm finding interesting: LocalNames, UniversalLineInterface

I've been helping Lion with ULI tools (DICT protocol gateway, support for ULI over HTTP GET when appropriate), and talking about capability security principles.

Though I can't imagine any specific use for it yet, I wrote a ULI client on Waterpoint - $local.uli.


The current implementation of E is absurdly slow. It's an interpreter, mostly unoptimized, implemented in Java.

My project: E implemented in Common Lisp—translating E to CL code and implementing ELib with CLOS.

Many Common Lisp implementations compile source to machine code at runtime, and a sufficiently clever translation of E code should benefit from this.

Status: The Kernel-E to CL translator is mostly complete. I have a crude Updoc implementation in CL, which I use for testing the other parts. ELib is only as complete as I've needed to test the translator.

Micro-status: Trying to figure out why var x := 2; x := x returns 0 instead of 2, and yet sets the variable correctly for later code.