Thursday, March 10, 2005

Modeling Functions With Side-Effects

In the process of finding a way to model functions and side-effects such that models can be usefully compared and composed, let’s look a little more deeply at mathematical functions and their relationship to programmatic functions in software.

Eric Weisstein et al., of the MathWorld crew, provide an excellent definition of a function.

This definition does not describe any intermediate states, does not imply sequence, and makes no association between the behavior of a mathematical function and time. As a class, mathematical functions are orthogonal to the concept of atomicity; whether a particular function is atomic depends on the definition of that function.

In particular, it depends on whether the expression which defines the function is a constant expression or makes reference to variables which may be altered by forces outside the scope of the function’s invocation.

In conventional usage, this definition of function is further restricted to demand that a function be a stateless mapping between a domain (the multi-dimensional space of the function’s inputs) and a range (the multi-dimensional space of the function’s possible output values).

The problem with this approach, though, is that when you project it from the mathematical domain into the software domain, the number of dependencies increases tremendously because of the context within which a function is executed. Implementation language, operating system, physical system resources, and so on all play a role in defining the absolute behavior of the function when evaluated. However, these things are “constant,” or common to all functions evaluated within the same environment, and do not substantially contribute to defining the critical behaviors that are characteristic of the function.

What is really needed is a way to separately identify the explicit and implicit dependencies of a function and categorize them so that a formal comparison can be made between functions with different behaviors belonging to different contexts.

I’m thinking of defining a term: open/closed with respect to functions. A closed function is one for which all information required for the correct evaluation of the function is provided at the time of invocation—it is reentrant. This term corresponds with the conventional interpretation of the definition of function. An open function is one which relies at least in part on one or more items of state which may vary from invocation to invocation and are not specified as part of the function’s contract. It is a broader interpretation of the definition of function, and acknowledges that side-effects, external dependencies and state may exist within the equation which defines the function.

Considering open functions as a degenerate form of mathematical function which can be transformed into an equivalent closed form is a useful step towards methodically transforming programmatic functions into closed mathematical functions, upon which the full range of prior art in functional analysis may be performed.
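A minimal sketch in Python may make the distinction concrete (the function and variable names here are mine, purely illustrative): an open function that silently depends on external state, and the closed form produced by pulling that state explicitly into the domain.

```python
# An "open" function: its result depends on module-level state
# (exchange_rate) that can change between invocations and is not part
# of the declared domain.
exchange_rate = 1.08

def price_in_dollars(amount_eur):
    return amount_eur * exchange_rate  # hidden dependency on external state

# The "closed" form: the hidden dependency is lifted into the domain,
# so every invocation carries all the information needed for its
# correct evaluation -- the function is reentrant.
def price_in_dollars_closed(amount_eur, rate):
    return amount_eur * rate

# Same mapping, made explicit:
#   price_in_dollars(100)              -> depends on ambient state
#   price_in_dollars_closed(100, 1.08) -> fully determined by its arguments
```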

In a conversation a couple of months ago about the nature and limitations of the definition of mathematical functions, Matt Larsen argued that all variables to a function must be part of its domain, thus excluding open functions from the scope of the definition of a mathematical function.

This is consonant with the commonly accepted interpretation of the definition of function stated above. By implication, though, it raises an interesting point which essentially contradicts this limitation. To wit:

By extrapolation, this means that the definition of the function is always part of the function’s domain.

For example:

F(x, y) = 7 → F(x, y, 7) = 7

F(x, y) = c → F(x, y, c) = c

Intuitively, this is correct, since the mapping from a domain to a range provided by a particular function depends as much on the defining equation of the function as it does on the instance values of the domain. From a customary point of view, this distinction could be argued as sophistic and redundant, but it does demonstrate that the common form of function is essentially an open function.

In the above example, F(x, y, 7) really is the full and correct statement of the function (for the domain (x, y), the value of any point in the range is always 7). However, we are quite comfortable with simply stating the function as F(x, y) and taking as separate knowledge that the defining equation is the constant value 7.

It seems logical and consistent, then, that we define other categories of separate knowledge, or side-effects, associated with the function F(x, y) which may in fact bear on its correct evaluation but are not necessarily included within the customary understanding of either the domain or the range.

For example, it may be a valid constraint that the range of the function lies within the set of positive integers. At that point, we have

F(x, y, 7, {pos int}).

But we still think of it and annotate it as:

F(x, y) = 7 (where 7 ∈ {pos int})

We give (x, y) different weight in considering the function F than we do {7, {pos int}}. For all practical purposes, (x, y) is sufficient to fully describe the domain of F.
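As a rough illustration in code (my own construction, not standard notation): the defining constant and the range constraint are exactly the kind of separate knowledge that a programmatic rendering can surface explicitly, mirroring the move from F(x, y) to F(x, y, 7, {pos int}).

```python
# Customary form: the defining constant 7 and the range constraint live
# outside the signature, as separate knowledge about F.
def F(x, y):
    return 7  # defining equation; the result happens to lie within {pos int}

# Fully explicit form, analogous to F(x, y, 7, {pos int}): the defining
# constant and the range constraint are both stated at invocation time.
def F_explicit(x, y, c=7, in_range=lambda r: isinstance(r, int) and r > 0):
    result = c
    assert in_range(result), "result fell outside the declared range"
    return result
```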

From the conversation with Matt:



Mack: I'm back on my mathematical functions and side-effects kick again

Matt: Discovering anything interesting?

Mack: I'm thinking about a variant form of function that I call "open" and "closed" functions for lack of the correct terms

Matt: What's the difference?

Mack: If I have a function F(a, b) whose definition is (2a + 3b)c

Mack: That would be an "open" function: c is not part of the declared domain, but the range is entirely dependent on the value of 'c' at any moment in time. The involution of that function into F(a, b, c) would be the "closed" form

Matt: I see. Well, you're outside of standard function terminology. That is, you're changing the basic definition of a function. So you might just get to define the terms.

Mack: Okay. In what way does it look to you like I'm changing the definition? What do you feel is the definition to begin with? (I’m starting from this definition of a function.)

Matt: Well, let me write what's in my advanced calc book. I'm not sure it's really a formal definition, but it's a start.

Matt: "If A, B are any two non-empty sets, the a function (or mapping) from A into B is a rule f that associates each a that is an element of A a uniquely determined b that is an element of B.

Matt: So the issue is this: your function maps from a set A which is defined in three variables (a, b, c). Right?

Mack: Well, in my example, I map from set A which is defined by two variables (a, b) but my map equation does not contain only constants

Matt: Sure it does. Your definition is the equation (2a + 3b)c. So you're mapping from (a, b, c) into a space with one value.

Mack: That's the closed form

Matt: Right. The definition implies the closed form. So you're expanding the definition.

Mack: My point is to confront the case where the mapping isn't from src->dest pristinely, but from src->dest (within some a priori context)

Mack: The pure definition of function doesn't imply the closed form. It imposes no limits at all on the equation of the function. It is just habitual and customary usage that assumes the limits because without them, it doesn't make a lot of sense for normal applications.

Matt: Ok. So I could believe a definition where you said g(a, b, c) is f(a, b) = (2a + 3b)c where c=7. That's a valid definition, and I would say it fits the definition.

Matt: You could now talk about such definitions where the c=7 clause is less fixed, I suppose.

Mack: Mmm... Not sure about introducing the second function.

Matt: Well, without it, I don't think it fits the definition you're giving.

Mack: But my basic point is if I had said F(a, b) is defined as (2a + 3b)7 you wouldn't have complained at all

Matt: Except that you're not defining a function. What you're doing doesn't match the definition you gave for what a function is.

Mack: but if I say F(a, b) is defined as (2a + 3b)c where c=7, then you instinctively force the function into the form F(a, b, c)

Matt: Right. Because that fits the definition of a function. What you gave first doesn't.

Mack: Why doesn't it? The formal definition of a function says NOTHING about what the equation of it must be.

Matt: Hmmm... So I might be willing to accept what you just wrote as a properly defined function. But that's different from what you wrote at the beginning.

Mack: I can, with a straight face, say that it is (2a + 3b)i

Mack: where i is the imaginary signifier

Matt: And I disagree. The definition of a function says that you must specify a point in the space you're mapping from. If that space has three dimensions, specifying two does not specify a point.

Mack: Right. I'm mapping from a space where (a, b) are the important dimensions

Matt: But (2a + 3b)c maps from a space with three dimensions.

Mack: Let me give a contrasting example

Mack: G(a, b) is defined as 7, so for all G(a, b) the answer is 7

Mack: That is a perfectly legal function

Matt: Sure. No problem with it. You're mapping every part of a two dimensional set to 7.

Mack: Okay, now let me turn that on its head and make it pathological

Mack: Q(a, b) is defined as the random function rnd() or, if you don't like rnd(), then Q(a, b) is defined as infinity

Matt: No problem with that either.

Mack: It becomes a multivalued function per the lit but still a legal function.

Matt: Yes. But you can't say that Q(a,b)=infinity defines a map for the set (a, b, c).

Mack: I'm not saying that. My point here is that the range is no more or less defined relative to the domain in any of those cases. If you want to take the big picture, what I'm implying is that c is infinity or c is seven in those examples

Mack: (a, b, infinity)

Mack: Let me point out why I care about this in the first place. Mathematical functions are typically the model for computer programs (Turing, et al.). In the purest form of programming (functional programming) this approaches truth, but side-effects are the bane of this whole approach and are generally just shrugged at as unquantifiable. However, if you are interested in transformation and translation of models, then understanding the context within which the model exists is crucial to your success, and that context is largely composed of layers of side-effects.

Mack: So if you can characterize the model as a closed function F(a, b, ...n), you can ask interesting questions about the larger open function that model is a subterm of and begin to do formal coverage analysis of the gaps in context between source and destination and build a bridge between one man’s function and another man’s context

Matt: So you're trying to model side effects as the "c" in your equation?

Mack: Bingo. You get the cookie.

Mack: In reality all computer programs are open functions. You can close them by identifying (c) and pulling it explicitly out into the domain

Matt: Hmmm... Well, you might be able to talk about some kind of incomplete function, like f(a,b) = (2a + 3b)c, where c will be defined at a later date. Not sure if you could get much out of that, but it's possible.

Mack: I thought about that, but it's pretty misleading

Matt: So what you're really doing is defining a function on (a, b, c), but only the part on (a, b) is currently known.

Mack: (c) is defined, it's just unknown. (2a + 3b)i would be more accurate

Matt: I think that's just a notational difference.

Mack: There's also a thought here that (a, b) is what is important. (c) is irrelevant, except that it must exist to permit the function to work. The notational difference loses the distinction between signal and carrier by weighting them all the same. For example, if (c) is the java runtime, then F(a, b) where c=java -> F(a, b) where c=perl is interesting because the function really is F(a, b). But you care about c insomuch as it scopes the work you have to do in conversion, and you are also interested in whether it is possible to do a straight conversion or whether the function must become F(a, b, d) where c=perl because there is no linear transformation possible

Matt: I could see interesting things coming out of talking about such a thing, at least in contexts where you can't determine c (or i, whatever) completely, but you can put bounds on it. Like maybe you know it'll be a positive integer, but you don't know what.

Mack: You are exactly right. My interest is in developing a thought framework for quantifying, characterizing and managing that unknown.

Matt: That seems fruitful to me. But to be within the definitional framework, I think you still have to talk about it in terms of a function of three variables, possibly composed of parts that are of lesser dimensionality, some of which are fixed while others are not.

Mack: I understand why I might want to talk about it that way. What I don't understand is why you think I have to.

Matt: Because otherwise you're not talking about functions. They don't fit the definition. Like, take (2a + 3b)c. What's the set you're mapping from?

Mack: Nothing in that equation defines the mapping set

Matt: Yes it does. It _can't_ be a two dimensional set.

Mack: the mapping set is defined when I give the function definition F(a, b)

Matt: No it doesn't. You can't resolve the function without three values.

Mack: Sure I can. The value may be 17c

Matt: not enough info to evaluate the function.

Mack: I think this argument is hinging around what defines the domain of a function

Matt: I think the def is clear. The domain of the function is the set that the function maps from.

Mack: You're essentially saying that if F(a, b) is dependent on c, then c must be part of F's domain

Matt: I believe that's what the def says. If F is dependent on a, b and c, then the set it maps from is {a, b, c}.

Mack: However, if that were true, then in the case of F(a, b) = 7; 7 must be part of the domain

Mack: Do you consider constant functions to be a special case? How about another case: F(x) = 7x. Is seven part of the domain?

Matt: No. It's a constant that is part of the definition.

Mack: F(x) = cx. Is (c) part of the domain?

Matt: Yes, if it's not constant.

Matt: If you want to specify it some way, that's acceptable. But you've got to specify it.

Matt: That is, you can say F(x) = cx where c=7.

Mack: I think your point in this case is that (c) can either be treated as a constant or a variable wrt the function. If I say cx, then it is being used as a constant and is okay (i.e., the function is not evaluating (c)), but if I say c=7, then I'm asking the function to evaluate c, and c must therefore be part of the domain.

Matt: If it's a constant, you've got to define it. Either it's a constant (or you have some definition for it), or it must be part of the domain.

Mack: This is really treacherous water though, trying to find a clean definition...

Mack: Because state has to be considered an external variable as well at that point

Matt: Yes. I understand. I think this is something that is perhaps not well defined.




Ontologies and the Design Process

I've been thinking a lot about ontologies and the role they play in developing a canon of thought governing Model-Driven Architectures and power tools for advanced pattern-based software composition and analysis.

Patterns and algorithms can be modeled as functions.

Functions exist within ontologies.

Composability of functions depends on the orthogonality of the ontology, which can be measured by the average height of a model. (Actually, I think there is an effect here which tracks in the transformed data of the model, not just its architectural height, since models could be infinitely recursive. The question might be: what is the transformation space of data made possible by all possible compositions of models? Is this a derivative ontology, or just an nth-order derivative solution space?)

Ontologies can be sparse or freely composable or any degree in between. Sparse composition ontologies can be analyzed for chokepoints.

Any collection of models can be analyzed to discover an ontology that represents the expressive range of compositions which can be constructed from the models.

Ontologies can be compared by examining their axes of freedom and the cardinality of the axes.

-----------------------------------------------------------

A good synonym for ontology might be paradigm. An ontology is a framework within which concept models can be constructed. A paradigm is a point of view, or way of thinking about things.

-----------------------------------------------------------

My current curiosity about ontologies as an instrument for implementing pattern-based design and model-model transformation is related to some thinking I was doing back in 1994. Back then I wasn’t familiar with the term ontology but I was concerned with the information density of what I thought of as design vocabularies.

What I was thinking about then, was a view of how the invention process worked, from idea genesis thru implementation. I visualized initial genesis as a single thing: an inspiration, or a collision of several thoughts to create The Idea. This idea is complete, in that it encompasses the entire concept but it is maximally vague. It is so abstract that many critical facets of the idea are unspecified and the whole is so divorced from the real world as to be not directly implementable.

From this prototypical idea, a tree of specialization descends for an arbitrary number of levels as the idea is broken down and refined and each portion defined with greater precision until finally a number of leaf nodes are reached which exist as tangible real-world implementations of their respective pieces of the idea.

Taken together, this set of leaves is equivalent to the original concept in terms of information content, but with a much higher degree of clarity. Individually, however, each leaf contains almost no information. It has reversed the information:concreteness ratio of the original Idea node.

At the time I was thinking about this model, I was interested in exploring the differences between portability and repurposability. People talk vaguely about reusability in software, but inspected in light of the above model of idea genesis, reusability can be seen to have two different aspects. I borrowed the terms portability and repurposability to differentiate these aspects.

Portability describes how well something can be taken from one context and used in another. In terms of our idea genesis tree, portability is a measure of how much of the top of the tree can be moved from one set of leaves to another different set of leaves. From this it is easy to see why little benefit is gained in many porting scenarios where an entire idea is moved from one context to another, but only the top few layers can be moved and the rest of the layers and all the leaves must be reimplemented.

Repurposability describes how much benefit is gained when something built for one task is used to perform a different task. In terms of our idea genesis tree, this can be visualized by how much of the bottom of a tree can be moved from one idea tree to another. Since the further down the tree you go, the less content and the more concreteness, it is easy to see that the larger the set of fragments that can be repurposed, the greater the benefit. If only leaf nodes can be repurposed, little is gained even though there are a large number of them. However, if several major sub-branches can be repurposed, some significant gains can be realized because some actual content is being effectively reused.

I was thinking in Shannon terms: specificity and entropy. My core notion was that the root concept node of the tree had maximum completeness but minimum specificity. Zero entropy, but maximum vagueness. Conversely, a leaf node was concrete but highly entropic.

In my visualization of the tree however, I left out an important factor which is one of the major drivers in the development of design techniques: redundancy. The decomposition of an arbitrary idea into a concrete implementation almost always involves a high degree of redundancy in the tree. Repurposing is the technique most commonly applied to attempt to fold the tree and reduce the number of redundant branches. Developing the ability to measure redundancy in a particular idea genesis tree will give us a way to understand the effectiveness of Repurposing and compare it to the cost.

In the idea genesis tree model, each level is an ontology. Design is an exercise of expressing an idea in an ontology whose conceptual distance from the idea is very small and then mapping the idea thru one or more intermediate ontologies to arrive at a representation which is concrete (implemented).

Design typically proceeds as a combination of two major activities: requirement specification and construction. Requirement specification is essentially the act of bounding the solution space of the design. Construction is the composition of elements in some ontology to model a portion of the solution.

It’s interesting to view these two activities in light of the model of idea genesis. We can see more clearly how they interrelate and how we might create automated tools to assist with the process.

For starters, it is seldom the case that design and requirements are specified in the same ontology so determining whether a given design satisfies the requirements is more art than science. If requirements were given in the same ontology as the solution, or could be transformed to be so, we could perform a rigorous comparison.

Historically, the problem with systems that attempt to provide ontologies for formal specification and validation of design against specification is that any ontology sufficiently orthogonal and comprehensive to specify an arbitrary design is highly entropic. It lies very far down on the idea genesis tree and the activity of specification becomes as laborious and fraught with risk as solution construction itself; essentially doubling the time required to complete a given project.

If we could establish transformational relationships between a number of ontologies, we could specify requirements in the ontologies which provide the highest leverage and then normalize them all thru factoring and transformation into a single ontology (or domain-partitioned set of ontologies) shared in common with the solution for actual comparative validation. This would give a much higher-leverage approach.

In order to be able to do this, we need to be able to talk about some properties of ontologies. Concreteness (how far from implementation) we’ve already talked about, but we also need expressiveness. Expressiveness could be broken down into a couple of properties that are easier to measure directly: conceptual distance, axes of freedom, sparseness, portability and repurposability.

Tuesday, March 01, 2005

Algorithms, Patterns and Duties

An Algorithm is a concise definition of how a piece of work is to be performed.

The term Pattern is a currently popular way to describe commonly occurring algorithms in a generalized fashion that can be easily adapted to any problem the pattern is suited to solve. It is an algorithm abstracted almost to the point of being a meta-algorithm.

A Duty is the complement to a Pattern. It describes a problem or piece of work in terms of what is to be accomplished rather than how it should be done. A Duty can be implemented by any number of patterns whose outcome, when applied to the case at hand, is equivalent. Likewise, Patterns can be thought of as being an organized set of duties and the algorithmic relationships between them.

As an example that illustrates these two concepts, consider the list pattern: In the list pattern, one of the things that must be done is that the list of elements must be traversed. Exactly how it is traversed is not important to the definition of what a list pattern is and does; only that some portion of the pattern is responsible for having traversal occur on demand. List traversal is a duty. Since there are any number of ways list traversal can be implemented, the traversal pattern or algorithm that is used to actually implement the traversal duty can be considered separately according to additional criteria without impacting the contractual behavior of the list pattern.
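A brief Python sketch may help (the class and method names are my own, purely illustrative): the traversal duty is declared as an abstract contract, and interchangeable patterns fulfill it without altering the contractual behavior of the list pattern itself.

```python
from abc import ABC, abstractmethod

class TraversalDuty(ABC):
    """The duty: traversal must occur on demand. How is left unspecified."""
    @abstractmethod
    def traverse(self, items):
        ...

class ForwardTraversal(TraversalDuty):
    # One pattern fulfilling the duty: front-to-back iteration.
    def traverse(self, items):
        for item in items:
            yield item

class ReverseTraversal(TraversalDuty):
    # A different pattern, equally valid: the duty's outcome is what matters.
    def traverse(self, items):
        for item in reversed(items):
            yield item

class ListPattern:
    """The list pattern delegates its traversal duty to whichever
    implementation best fits the application's criteria."""
    def __init__(self, items, traversal: TraversalDuty):
        self.items = list(items)
        self.traversal = traversal

    def each(self):
        return self.traversal.traverse(self.items)

# Swapping traversal patterns does not alter the list's contract:
#   list(ListPattern([1, 2, 3], ForwardTraversal()).each()) -> [1, 2, 3]
#   list(ListPattern([1, 2, 3], ReverseTraversal()).each()) -> [3, 2, 1]
```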

Why is this interesting? Because with this distinction, we take another step towards uniform composability of patterns as problem-solving tools. We can weigh alternate algorithms which are isomorphisms of a particular duty, choosing the best fit for a particular application based on a comparison of criteria.