Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Programming idioms for nominal adapton #21

Open
matthewhammer opened this issue Feb 22, 2015 · 1 comment
Open

Discussion: Programming idioms for nominal adapton #21

matthewhammer opened this issue Feb 22, 2015 · 1 comment

Comments

@matthewhammer
Copy link
Contributor

High-level Objectives

Let's use mergesort as an example to figure out a general principle for programming with "adaptontic" data-structures and computation.

The goals are eventually to:

  • Define annotation obliviousness as the IC/Adapton analogue of "cache oblivious" algorithm design. The intuitive idea is that annotations (Name and Art) cases come from input structures and propagate, through computation, to output structures. We want our programs to preserve the occurrences and distributions of these annotations in an obvious way (not fine-tuned for each possible frequency).
  • Establish the programming idioms informally. Write more examples, and compare approaches that follow or don't follow our idioms. Find out the simplest idioms that give us annotation obliviousness.
  • Establish the programming principle formally. Ideally, this can be done with a transformation from a known programming model (call-by-value ML with explicit thunks).

Status today

Based on what I'm seeing emerge, I think that these goals are achievable. Here is the general principle that I see developing:

  • Name consumption Name cases come from the input structure. The programmer carries them, monadically, to the output structure.
  • Name production The time for producing a Name varies, but in the case of mergesort, it always corresponds to producing a single element of output (either a One in a rope or a Cons in a list).

These principles work for densely-annotated structures. As it turns out, it is both more natural and more general to write the sparse version, in which the programmer may not have a name handy:

Sparse annotations When annotations are sparse (granularity is coarser), there are runs of Cons cells without intermediate Names and likewise, subropes of Two and One nodes without Names. In these cases, the programmer carries, for each input structure being deconstructed, a Name.t option. The None case signals a chunk of unannotated structure, the Some case indicates that a name (from the input) is ready to be produced in the output structure.

The challenge is consistency consuming and producing names, with following other principles that are needed for correctness and for incremental efficiency:

  • So that names are available across recursion boundaries, each produced named must be carried (recursively) to a potential point of "terminal consumption"
  • Each name can be "terminally consumed" at most once, by done one of two things:
    • Emitting the name in an output structure, via a Name constructor
    • Emitting an articulation in the output structure, via an Art constructor and a call to A.mfn_nart (for some module A).
  • A "non-terminal" consumption of a name forks it, deterministically producing two more names. This is necessary when control-flow splits (e.g., for multiple recursive calls, as in rope_mergesort_improved), or when one needs to emit both a Name and an Art, which is common.

Towards a compositional theory of annotation obliviousness

When composing transformations, we find that their (lazy, nominal) interaction can be complex. In particular, the output structure of one computation becomes the input structure of another, and the "terminal" consumption cases of the first computation become initial sources of names and articulations to the second.

To investigate these compositions further, I propose investigating more hypotheses. For instance:

Example Hypothesis: In rope mergesort, there is a (nearly) preserved relationship between each input element and the Name that (precedes or follows?) it, from the unsorted input to the sorted output.

@kyleheadley
Copy link
Contributor

One point to clarify:

The data representation of SpreadTree.ml discussed above includes separate constructors for Names and Arts. We're currently committed to this, and it might be the best solution, but it leaves us open to poor coordination between Arts. We need to keep in mind that Names are there to act as seeds to later Arts, not as articulation points themselves.

One goal for efficiency is to have the 'articulation boundaries' line up, that is, to have each Art dependent on as few other Arts as possible. The current strategy is to emit an Art for each Name, and to take care to keep them coordinated. This probably works, but it won't be explicit for later programmers who copy our code without understanding what coordination is in place.

I've started Struct.ml with this and other issues in mind, but it can't be tested until issue #22 is complete and a lot of code is rewritten. We should probably hold off on it until after the current paper submission. In the mean time, SpreadTree.ml is more flexible for coordination experiments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants