kpreid | Entries tagged with data formats

I need to stash this info somewhere; might as well be well-indexed.

Stereo photos such as those taken by the Nintendo 3DS, with a .MPO file extension, are actually JPEG right-eye images with the left-eye image embedded as extra data. They can be interpreted as JPEG files (perhaps after changing the extension to convince your software to read them), and the left-eye image can be extracted with exiftool, as follows:

exiftool input.mpo -mpimage2 -b > L.jpg

A standalone right-eye image without the extra data can be produced with

exiftool -trailer:all= input.mpo -o R.jpg

Source, via.

 +----------------------------------------------------------------------------+
 |                                                                            |
 |                                 A thought                                  |
 |                                 ---------                                  |
 |                                                                            |
 |  |\  /|onospaced text  documents were the first  WYSIWYG-edited documents. |
 |  | \/ | One can create paragraph formatting and headings, special layouts, |
 |  |    | integrated diagrams, decorations --  all interactively with exact  |
 |  immediate feedback  upon just how it  will appear (outside of  choice of  |
 |  font) on screen, paper printout, someone else's email client -- anything  |
 |  whatsoever that displays this medium.  Though,  font characteristics can  |
 |  affect the quality of diagrams and decoration; for example, how high the  |
 |  “~” or “^” symbol is,  and how much of a character cell is filled by the  |
 |  character shape.                                                          |
 |      This, however, is a minor disadvantage,  and I have even today found  |
 |  uses for this style of document preparation;  for example, certain class  |
 |  assignment submissions.  The usual tool is Microsoft Word documents; but  |
 |  I do not own a copy of Word,  and while OpenOffice (dot org) is adequate  |
 |  for reading, its rendering is often different from Word's  -- and I have  |
 |  even encountered data loss bugs: “Hey! Where'd my table go?”  After that  |
 |  incident, I resorted to plain text for the assignment (which was essenti- |
 |  ally tabular in structure) and have used such formatting since for those  |
 |  things which are amenable to the format.  I could have used HTML, or sub- |
 |  mitted a PDF rendered from OpenOffice-on-my-computer or a LaTeX document, |
 |  but I wanted to choose a format  which I was confident would seem a reas- |
 |  onable type of document to the recipient,  and plain monospaced text fit  |
 |  that role well.                                                           |
 |      It can even be fun  to lay out your document  completely by hand, if  |
 |  it is not too long  -- and rather than fiddling with margins,  tab stops, |
 |  table editing tools, etc.,  you can just  *write what you want* directly, |
 |  just as much as if you were writing on paper, with all the advantages of  |
 |  an electronic document. Unless you want graphics that are no bigger than  |
 |  a character.                                                              |
 |                                                                            |
 |   |/    .    /)  . /                                                       |
 |  /\_Ø\///\  /\_Ø/(/.                                                       |
 |                                                                            |
 |  P.S. If you want to reply in like style,  use the <pre>...</pre> element  |
 |  and click the “More Options...” button to get to the “Don't auto-format”  |
 |  option.                                                                   |
 |                                                                            |
 +----------------------------------------------------------------------------+

Something I've wished for several times recently is a database-document program.

By "document" I mean that the database is a single file, which I can move, copy, etc., as opposed to living in a database server which has to stay up, uses accounts and ACLs, needs special backup procedures, and so on. It doesn't need to support humongous data sets — fits-in-memory and even linear searches are fine.

I am aware that people use spreadsheets for such purposes, but I would like to have named, typed, and homogeneous columns, easy sorting/filtering/querying, etc. which I assume I'm not going to find there. Relational would be nice too.

It must be GUI, and run on Mac OS X, but it doesn't have to be thoroughly native — I can stand the better sort of Java or perhaps even X11 app.

And finally, it should have a file format that is either obvious to parse, has a specification, or is supported by many other programs.

Does such a thing exist?

(If not, I might write it.)

“Language-independent” just means they invented a new language.

Apple's Sampler is a profiler based on the principle of periodically collecting the entire call stack of the executing threads, then summarizing these stacks to show what occurs frequently; primarily, as a tree, rooted at the bottom of the stack, where each node shows the number of times that call sequence was found on the stack.

SBCL's sb-sprof is a profiler which also collects call stacks, but its summary report is much less useful to me as it does not provide the per-branch counting; just top-of-stack frequencies and a caller/callee graph.

Therefore, I examined Sampler's file format and wrote code to generate it from sb-sprof's record.

The file is mixed text/binary, LF line endings. The grammar, as far as I've determined it, is:

  "@supersamplerV1.0" LF
  "@symboltableV1.1" LF
  (TAB int32<id> TAB int32<unknown> 
   TAB text<symbol> 
   TAB text<library-path> TAB text<library-path> LF)*
  "@end" LF
  (
    "@threadV1.0" TAB int16Hex<thread-id> LF
    (
      TAB int32<1> int32<0> int32<1> int32<count of stack-frame> (int32<stack-frame>)* LF
    )*
  )*
  "@end" LF

where by "int32" I mean a big-endian 32-bit (unsigned?) integer (i.e. four not-necessarily-ASCII bytes), and by "int16Hex" I mean a 16-bit integer in hexadecimal (i.e. four ASCII bytes).

"id" is an arbitrary identifier for this symbol. "unknown" is occasionally nonzero, but I don't know what it means. "symbol" is the name of a function/method found on the stack. "library-path" is the pathname to the object file it was loaded from (relative in the case of a standard framework, e.g. "Carbon.framework/HIToolbox.framework/HIToolbox").

"thread-id" is an identifier for the thread, which should occur as an "id" in the symbol table; the upper 16 bits evidently must be 0. Thread symbol table entries have a name and library path which is the string ("Thread_" int16<thread-id>); I have not confirmed whether this is necessary.

Each entry in a @thread block is one sampling of the whole stack of that thread. I do not know what the 1, 0, and 1 mean, but the fourth integer is the number of frames on the stack; immediately after are that many integers, each of which is an id from the symbol table.
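
The grammar above can be sketched as a small Python writer. This is an illustration only, not the Lisp code referenced in this entry. The function name write_sampler_file, the shape of its arguments, writing 0 for the "unknown" field, repeating one library path in both path fields, and formatting the thread id as four uppercase hex digits are all assumptions.

```python
import io
import struct


def write_sampler_file(out, symbols, threads):
    """Write a Sampler-style trace to the binary stream `out`.

    symbols: list of (id, name, library_path) tuples (hypothetical shape).
    threads: dict mapping thread id -> list of stacks, where each stack
             is a list of symbol ids.
    """
    out.write(b"@supersamplerV1.0\n")
    out.write(b"@symboltableV1.1\n")
    for sym_id, name, library in symbols:
        out.write(b"\t" + struct.pack(">I", sym_id))  # big-endian int32 id
        out.write(b"\t" + struct.pack(">I", 0))       # "unknown"; 0 is an assumption
        out.write(b"\t" + name.encode("ascii"))
        out.write(b"\t" + library.encode("ascii"))    # library path, written twice
        out.write(b"\t" + library.encode("ascii"))
        out.write(b"\n")
    out.write(b"@end\n")
    for thread_id, stacks in threads.items():
        # int16Hex: four ASCII hex digits (uppercase is an assumption)
        out.write(b"@threadV1.0\t" + ("%04X" % thread_id).encode("ascii") + b"\n")
        for stack in stacks:
            # The constants 1, 0, 1 are copied from the grammar; their meaning is unknown.
            out.write(b"\t" + struct.pack(">IIII", 1, 0, 1, len(stack)))
            out.write(struct.pack(">%dI" % len(stack), *stack))
            out.write(b"\n")
    out.write(b"@end\n")


# Usage: one thread (id 1) with a single one-frame sample.
# Naming the thread's symbol "Thread_0001" is likewise an assumption.
buf = io.BytesIO()
write_sampler_file(
    buf,
    [(1, "Thread_0001", "Thread_0001"), (2, "main", "a.out")],
    {1: [[2]]},
)
```

Because the frame ids are raw binary, the file has to be opened in binary mode when writing to disk.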

Files generated from this structure are accepted by Sampler, but not always by Shark; I don't know why, and my attempt at tracking it down made it seem to depend on the size of the trace file.

Here is code to generate such a file from sb-sprof data; it should be loaded in the SB-SPROF package: ( SB-SPROF to Sampler )

This code generates a noninteractive Sampler-style tree report from SB-SPROF data. ( SB-SPROF tree report )

plist.py 1.2 is now available, adding support for boolean values. It is a Python module which converts between Mac OS X XML property lists and Python data structures.

Thanks to Robert White for the addition.

When I used the Apple IIgs, I played with an implementation of an extension of the WireWorld cellular automaton.

My previous entry contains the programs I recently wrote to decode my old pattern files. (If you read it earlier: I've now updated it to include a full SHR converter.)

These were originally created in 1996-1998, except for “Sample.Fix”, which is a trivial modification of a pattern which came with the program.

The extensions are rather complex in the details, but essentially, green cells are crossovers (horizontally moving electrons don't interact with vertical wires, etc.) and cyan cells are switches (electrons arriving on the wire that ends at the switch break or join the wire that (typically) crosses the switch).

Larger views.

Adder

( More patterns )

I recently found I wanted to retrieve some old Apple IIgs image files. After vague memories and some research, I found that they were “Super Hi-Res” images, which are dumps of screen memory in SHR mode, passed through the PackBytes compression routine.

I haven't found any on-the-web documentation of the SHR layout, but according to books and my successful decoding, the pixel data starts at the beginning of the file, has 160 bytes per row, 200 rows, and no end-of-row data.

After the image data are 200 scanline control bytes (SCBs) and 16 color tables. I haven't looked at decoding these yet.

In 320 mode, each pixel is 4 bits, an index into a 16-position color table. In 640 mode, each pixel is 2 bits, and the color table is treated as four groups of 4 entries: (x-position mod 4) selects the group and the pixel value selects the entry within it. The default color table has black and white in the same position in each group of 4, and the same colors in the 1st and 3rd, and 2nd and 4th, groups. Thus, any pixel can be black or white (which was used for fonts) and pairs of pixels can be any of 15 distinct dithered colors (there are necessarily two grays).
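
As a rough Python sketch of the pixel layout described here (not the Lisp converter referenced in this entry), the functions below turn the first 160 × 200 bytes of an uncompressed image into per-pixel values. The names decode_320 and decode_640 are made up for illustration, and the bit order (most significant bits are the leftmost pixel) is an assumption. No palette lookup is attempted, so the results are raw color-table indices.

```python
ROW_BYTES = 160  # bytes per row of pixel data
ROWS = 200       # rows per screen


def decode_320(screen):
    """Return 200 rows of 320 values (0-15), one 4-bit index per pixel."""
    rows = []
    for y in range(ROWS):
        row = []
        for byte in screen[y * ROW_BYTES:(y + 1) * ROW_BYTES]:
            row.append(byte >> 4)    # left pixel: high nibble (assumed order)
            row.append(byte & 0x0F)  # right pixel: low nibble
        rows.append(row)
    return rows


def decode_640(screen):
    """Return 200 rows of 640 values (0-3), one 2-bit value per pixel.

    Choosing the final color also depends on the pixel's x-position mod 4,
    which is left to the caller.
    """
    rows = []
    for y in range(ROWS):
        row = []
        for byte in screen[y * ROW_BYTES:(y + 1) * ROW_BYTES]:
            for shift in (6, 4, 2, 0):  # leftmost pixel first (assumed order)
                row.append((byte >> shift) & 0x03)
        rows.append(row)
    return rows
```

Both functions ignore anything after the first 32,000 bytes, so the scanline control bytes and color tables that follow the pixel data are simply skipped.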

Here's a program in C to decompress PackBytes format:

( unpackbytes.c )
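
For comparison, here is a minimal Python sketch of PackBytes decompression. It is not a port of the C program above. The record layout it assumes is not described in this entry: each record begins with a flag byte whose top two bits select the record type and whose low six bits, plus one, give a count. Treat that layout, and the name unpack_bytes, as assumptions to check against Apple's toolbox documentation.

```python
def unpack_bytes(packed):
    """Decompress PackBytes data (assumed record layout, see comments)."""
    out = bytearray()
    i = 0
    while i < len(packed):
        flag = packed[i]
        i += 1
        count = (flag & 0x3F) + 1  # low six bits, plus one
        kind = flag >> 6           # top two bits: record type
        if kind == 0:
            # 00: `count` literal bytes follow
            out += packed[i:i + count]
            i += count
        elif kind == 1:
            # 01: the next byte is repeated `count` times
            out += packed[i:i + 1] * count
            i += 1
        elif kind == 2:
            # 10: the next four bytes are repeated `count` times
            out += packed[i:i + 4] * count
            i += 4
        else:
            # 11: the next byte is repeated `count` * 4 times
            out += packed[i:i + 1] * (count * 4)
            i += 1
    return bytes(out)
```

The sketch does no validation, so truncated input silently produces short output rather than raising an error.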

This program, in Common Lisp, will convert an uncompressed SHR image to PPM:

( shr.lisp )

The :ww-colors option exists because the particular files I wanted to convert were written by a program for running the WireWorld cellular automaton, which used a custom palette, but wrote its pattern files with the standard palette.

Profile

Kevin Reid

Kevin Reid's blog

Entries tagged with data formats

How to extract separate images from a 3D “.MPO” photo

On the virtues of monospace formatting

Database document software?

(no subject)

Apple's Sampler file format, and SBCL SB-SPROF report generation

plist.py 1.2

Old WireWorld

Apple IIGS SHR and UnPackBytes
