Good Math/Bad Math

Wednesday, May 31, 2006

Improbable is not impossible, redux

I actually decided to say a bit more about the concept of improbable vs. impossible after my last post.

As I've said before, the fundamental idea behind these arguments is what I call the "big numbers" principle: We (meaning human beings) are very bad at really grasping the meaning of very large numbers. It's hard to really grasp what a million is: if you counted one number a second, with no breaks of any kind, it would take you more than 11 days to count to one million. When we start getting to numbers that dwarf a million, our ability to really understand what they mean - to really grasp on a deep, intuitive level what they mean - just completely goes out the window.

What does 10^20 mean? If it takes 11 days to count to one million, then it would take roughly 11 times 10^14 days - or 3 times 10^12 (aka 3 trillion) years to count to 10^20. That just doesn't mean anything on an intuitive level. So what does 10^100 mean?

What happens when we start to look at probability? Well, it's really easy to get huge numbers in probability calculations - the kinds of numbers that we can't really grasp. And because those numbers are so meaningless to us, so utterly beyond the reach of what we really comprehend, the probabilities of astonishingly common events can seem like they're entirely beyond the realm of possibility: they must be impossible, because how could anything so easily comprehensible create numbers so incomprehensible?

Who hasn't shuffled cards? Who hasn't played a game of klondike solitaire? And yet, every time we do that, we're witnessing an event with a probability of roughly 1 in 8 times 10^67!
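Just to make that number concrete, here's a quick back-of-the-envelope sketch in OCaml that computes log10(52!) - the number of distinct orderings of a 52-card deck - without overflowing anything. (The function name is just something I made up for this example.)

    (* log10(n!) = log10(2) + log10(3) + ... + log10(n); summing logs avoids overflow *)
    let log10_factorial n =
      let rec go i acc =
        if i > n then acc else go (i + 1) (acc +. log10 (float_of_int i))
      in
      go 2 0.0

    let () =
      (* prints roughly 67.9: 52! is about 8 * 10^67 *)
      Printf.printf "log10(52!) = %.3f\n" (log10_factorial 52)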

If something as common as shuffling a deck of cards can produce a result with odds of 1 in 10^67, then think of the probability of something uncommon: what's the probability of the particular arrangement of molecules in a particle of dust? Or the shape of a comet's tail? Those are all natural, possible events - and the probability of them happening the way that they do is pretty much beyond our capacity to imagine.

When you grasp that, then you start to realize just how meaningless the big numbers argument is. No matter how improbable something is, that does not mean that it is impossible. Impossible is a probability of 0. And any probability, no matter how small, is almost certainly the probability of something that's occurring in the universe, somewhere, right now.

Repeat after me: Improbable IS NOT Impossible

I got a request to take a look at this. It's yet another one of those innumerate jackasses arguing that god must exist, because life is too unlikely otherwise.

It's the same old crap, but with one interesting twist - this guy tries to provide an argument for why anything less probable than a specific level of improbability is absolutely physically impossible. In his own words:
In a previous post, I presented a logical, factual argument for the existence of God. But one of my readers, Seanny McShawn, took issue with my syllogisms, saying that I was simply making an argument “from incredulity”.

When someone tries to win an argument based on simple probabilities, this is called an “argument from incredulity.” This is a logical fallacy. In other words, the sheer unlikeliness of a scenario does not preclude its possibility and cannot be proof against it.

But was I arguing from “incredulity”?
The answer is, pretty much, "yes".
Physicists estimate that in our universe there are 10^80 particles. Mathematicians say that the mathematical level of absolute impossibility is 1 chance in 10^50.
Not any competent mathematician.
However, the physical level of absolute impossibility is 1 chance in 10^80, and here’s why:

On the basic level, probability is defined by the ‘particle’ example: finding a specially marked particle among 500,000 particles is beating odds of 1 in 500,000. In a universe that has 10^80 individual particles, the most improbable scenario is finding a specially marked particle in the entire universe. Due to the size of our universe, it is impossible to have a more improbable set of odds than 1 chance in 10^80. Anything that is more improbable than the most improbable is by all standards absolutely impossible.
I had never seen this particle example before, but nonsense is what it is. And it is, in fact, astoundingly easy to demonstrate why.

Ready? Are you sure? Ok. Here goes:
adfjiopwqeruoipboisdupoqwerbjksggfiuqwerup9ujdknc
blkjdbfqlijwecna34kjvldjfghqiopewufnadsmvbviuhwro
wprotiudvncmbcxbnajkhgpqrjwopeporewgjvdlkfmfkpowe
ri3nwe9mfwkdgnguipewiofkxnvcvpuiweklhhjnjwrn9fpuo
isdhpvjpndjkgnewqhtrepiuwifjoxcjvkneqotiurhwepafk
makdlgnqwejrhpqfioudshnfeiqjhweiufh
That's 286 random characters, produced by pounding my hands on my keyboard, and then deleting any punctuation characters (because I'm too lazy to figure out how many unshifted characters there are on my keyboard, so I stuck to numbers and letters.) The odds of generating that particular string are 1 in 36^286. That's quite a bit less likely than 1 in 10^80. But guess what? It's possible!

Or, just to annoy the fundies, go get yourself a deck of Tarot cards. (Hey, they're good to have. There are some damned neat poker variants that you play with a Tarot deck, not to mention some funky solitaires.) A tarot deck has 78 cards. Shuffle the deck. Lay the cards out in order. The odds of that ordering? Less than 1 in 10^115. But entirely possible.

What is the probability of a solar flare taking a specific shape? Think of the trillions of trillions of particles in a flare, and the position of every single one of them. The probability of that exact configuration? Quite a bit worse than 1 in 10^80.

This one in 10^80 stuff is pure nonsense. It's just another attempt to create an arbitrary threshold beyond which we can say something is impossible.

He then goes on to repeat yet another variant of the "odds of life based on the probability of a particular protein" gibberish. I've covered that plenty of times on this blog; for example, my analysis of Berlinski's version of this argument here.

So this guy concludes with:
This is not an argument from incredulity. This is an argument from facts: cold, hard facts. Since any set of odds above 1 in 10^80 is absolutely impossible, random chance could not and did not produce life.
Except that it is an argument from incredulity, utterly lacking in cold, hard facts. In fact, it's trivially refuted by cold hard facts; and it's based on one of the most pathetic attempts I've seen so far to define "impossible" in terms of probability.

Magic 23.5 update: Osborne runs away.

Just thought I'd put up a quick little note here. I found that the author of the "23.5" stuff, after giving up on responding here, has set up a nice little website - sans comments - in which he defends himself from the criticisms here.

Since I went to the trouble of publicly presenting his defense on the front page of this blog, and engaging in a civil discussion with him in the comments, I'm more than a little bit ticked at this. He's obviously welcome to handle things as he wants, but I find it rather cowardly of Gary to silently drop out of a public, open discussion here, and then continue it on a private site, where no one who disagrees with him is permitted to respond.

I'm not going to link to it, since he's got rather a lot of advertisements set up (you actually have to sit through an ad on the main link to the site before you're allowed to read any content), and I'd rather not send cash in his direction as a reward for this. If you really care, you can find it either through google, or through the sitemeter referrals link on the bottom of the front page of this blog.

Practical Applications of Good Math: Type Checking in Programming Languages

(Since I see that there are still links pointing at this post, I'll point out here that this blog has moved to scienceblogs. This discussion was continued in a post there.)

I thought today I'd go a bit off topic, and rant a little bit about stuff that's been bugging me lately in my real work. It does, in its own bizarre way, relate to the lambda calculus series I've been writing.

What I do professionally is research in a field called collaborative software development: how to build tools that help groups of people work together to build large systems. It's an interesting thing to do, with a huge range of things that can be done, ranging from better programming languages to better configuration management systems to better programming environments to entirely new kinds of collaborative tools. It involves practical programming, math, psychology and human factors, and a few other interesting things.

My approach to this kind of work is to view the key problem as being one of communication. Most problems in large systems are ultimately not algorithmic errors - they're errors where some key piece of information was not correctly communicated between developers. For example, one of the most common bugs in large systems occurs when the implementor of a piece of code makes assumptions about how their code is going to be called; and the developer who uses that code makes different assumptions. To be concrete: I experienced this recently on my project at work. I'm working on a bit of code, and one function I call takes a lot of parameters, each of which is an array of objects. Most of the time, you only supply a real value for one or two out of a list of 8 parameters. I assumed that for unused parameters, it made sense to pass null - that's how I would have written that method. The person who wrote it expected callers to never pass null, but to use empty arrays for unused parameters. Program compiles, deploys, and boom: exceptions out the wazoo. (There was a dynamic check for null, but because of some proxy stuff we do, it was very hard to track it down.)

This isn't a result of incompetence on anyone's part. This project has some of the finest developers that I've ever met working on it. The person who wrote that method with the awful parameter list is actually a pretty great programmer - one of the top people from one of the top groups of developers that I know of. I don't think I'm incompetent either :-). So why did the error happen?

Because a key piece of information didn't get passed between us. Between the size of our system, the number of people involved, the way the code is partitioned into components, and where the documentation lives, the information about what to do about empty parameters got lost.

What does any of this have to do with lambda calculus? Types.

That problem wouldn't have happened if we were programming in my favorite programming language, OCaml. In OCaml, to be able to pass a "null" value, I actually need to specify that in the parameter declaration (as an Option type). There's no way I can pass a null to a function that wasn't expecting one.
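Here's a rough sketch of what I mean, in OCaml. The function names and signatures are made up for this example - they're not from our actual system - but they show the difference between a parameter that can never be null and one that explicitly admits "nothing":

    (* a plain list parameter can never be null; "nothing to pass" is just the empty list *)
    let describe (key : string) (extras : string list) : string =
      key ^ "/" ^ String.concat "," extras

    (* if "nothing" is a legitimate value, the type has to say so, and both the
       caller and the callee have to handle that case explicitly *)
    let describe_opt (key : string) (extras : string list option) : string =
      match extras with
      | None -> key
      | Some xs -> key ^ "/" ^ String.concat "," xs

    (* describe "k" None        <- this doesn't even compile *)
    let _ = describe_opt "k" None   (* this does, and the intent is right there in the type *)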

It's a common assertion among a lot of the elite hackers out there that "real programmers don't use strongly typed languages" - you see it, for example, in Paul Graham's writing about strongly typed languages. Here's Paul on Java:
It's designed for large organizations. Large organizations have different aims from hackers. They want languages that are (believed to be) suitable for use by large teams of mediocre programmers-- languages with features that, like the speed limiters in U-Haul trucks, prevent fools from doing too much damage.
This attitude really bugs me. The thing is, types are meta-information about a program - and useful meta-information. And the more people you have involved in a system, the more important it becomes to have strong, formal information about parts of the system that allow automatic checks. There are a few distinct problems with the anti-typing attitude:
  • The elitism issue. "Yeah, I'm a super-duper programmer who doesn't make mistakes; people who don't work the way I do are just inferior." Yeah, tell that to the guys who wrote OCaml.
  • The cluelessness issue: "I write programs in Perl/Python/CommonLisp/... with three other guys who are all lisp wizards, and if we don't have a problem, then no one should, so that typing stuff is just nonsense to cover up for incompetence." There are problems that emerge as the size of a system and the number of people working on it increase. Three people hacking on something is very different from 30 people hacking on it, which is very different from 300 people. As the number of people involved increases, you need to do more to ensure that information gets to where it's needed.
  • The information loss issue. This is the one that actually bugs me the most. Programmers know what types they're using. They know what types they're passing. They know information about the parameters they're taking, and the parameters that they're passing. Good type systems give them a way to write that down - and not just write it, but make it a part of the code, so that it can be tested. Writing it down as a part of the code means that the information is recorded; that it's maintained; and that it's checked. Not declaring types saves you a bit of keyboard time - but it means that there's important information in your head which you're not writing down. (I have exactly the same rant about parallelizing compilers, which is that you have to structure your code very carefully and precisely to allow the compiler to figure out the information that you knew before you wrote the code; but instead of letting you write it down, you have to trick the compiler into figuring it out on its own.)
The most common response from people like Paul to rants like mine is "Strong testing, not strong typing." The idea being, type checking is really something like a very simple kind of compile time testing, which is inferior to a real, complete test suite.

Sure, strong typing in a language is no replacement for testing. But why on earth should I write tests to verify correct typing, when I can just put the type information in line in the code and have it tested automatically? Typing and type checking is one part of the process; having a typed language doesn't mean you shouldn't write tests; writing tests doesn't mean that compile-time type checking is useless. Pulling the typing information out of line into a test is, frankly, just dumb: the more you spread the information around between multiple places, the less likely it is to be maintained, or even found.

I'm a huge unit testing fan; but one of the major limitations of testing is information scattering. You're putting information about the invariants and requirements of your code in the logic of the tests - and because of the multidimensional nature of software systems, that inevitably means that some of the information is here, some there. Trying to assemble the information you need from a list of test failures can be incredibly frustrating; and those tests often don't get maintained, so things change, but the tests aren't changed to thoroughly respect the new semantics. Compile time checking can't get out of synch with the current state of the code.

Tuesday, May 30, 2006

Types in Lambda Calculus

Now that we've got intuitionistic logic under our belts, we're ready to get back to lambda calculus: we've got the logical tools we need to define the model. Of course, nothing is ever quite that simple, is it?

What we've talked about so far is the simple untyped lambda calculus. That's how LC started; it's the first version that Church invented. But it did end up with some problems, and to get around that, a concept of types was introduced, leading to the simply typed lambda calculus, and on from there to a world of variants - System T, System F, the Lambda Cube (not related to the time cube :-)), and so on. And eventually, people realized that the untyped lambda calculus was really just a pathologically simple case of a typed lambda calculus - it's a typed LC with only one type.

The semantics of lambda calculus are easiest to talk about in a typed version. For now, I'll talk about the simplest typed LC, known as the simply typed lambda calculus; and how it's ultimately semantically equivalent to an intuitionistic logic. (In fact, every version of typed LC corresponds to an IL; and every beta reduction in an LC corresponds to an inference step in an IL proof; that's why we needed to sidetrack into intuitionistic logic to get here.)

The main thing that typed lambda calculus adds to the mix is a concept called base types. In a typed lambda calculus, you have some universe of atomic values which you can manipulate; those values are partitioned into simple types. Base types are usually named by single lower-case greek letters; since that's a pain on Blogger, I'll use upper-case english letters for type names. So, for example, we could have a type "N", which consists of the set of natural numbers; a type "B" which corresponds to boolean true/false values; and a type "S" which corresponds to strings.

Once we have basic types, then we can talk about the type of a function. A function maps from values of one type (the type of the parameter) to values of a second type (the type of the return value). For a function that takes a parameter of type "A", and returns a value of type "B", we write its type as "A -> B". "->" is called the function type constructor; it associates to the right, so "A -> B -> C" means "A -> (B -> C)".

To apply types to the lambda calculus, we do a couple of things. First, we need a syntax update so that we can include type information in lambda terms. And second, we need to add a set of rules to show what it means for a typed program to be valid.

The syntax part is easy. We add a ":" to the notation; the colon has an expression or variable binding on its left, and a type specification on its right. It asserts that whatever is on the left side of the colon has the type specified on the right side. A few examples:
  1. lambda x : N . x + 3. This asserts that the parameter, x, has type "N", which is the natural numbers. There is no assertion of the type of the result of the function; but since we know that "+" is a function with type "N -> N -> N", we can infer that the result type of this function will be "N".
  2. (lambda x . x + 3) : N -> N. This is the same as the previous, but with the type declaration moved out, so that it asserts the type for the lambda expression as a whole. This time we can infer that "x : N" because the function type is "N -> N", which means that the function parameter has type "N".
  3. lambda x : N, y : B . if y then x * x else x. A two parameter function; the first parameter is type "N", the second type "B". We can infer the return type, which is "N". So the type of the full function is "N -> B -> N". This may seem surprising at first; but remember that lambda calculus really works in terms of single parameter functions; multi-parameter functions are a shorthand for currying. So really, this function is: lambda x : N . (lambda y : B . if y then x * x else x); the inner lambda is type "B -> N"; the outer lambda is type "N -> (B -> N)".
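To see the currying in example 3 in actual code, here's a rough OCaml equivalent, using OCaml's int and bool to stand in for my "N" and "B" types:

    (* the multi-parameter form *)
    let f (x : int) (y : bool) : int = if y then x * x else x

    (* the explicitly curried form; to OCaml these are the same thing *)
    let f' : int -> bool -> int =
      fun (x : int) -> fun (y : bool) -> if y then x * x else x

    (* partially applying the first parameter gives a function of type bool -> int *)
    let square_or_keep_three = f 3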
To talk about whether a program is valid with respect to types (aka well-typed), we need to introduce a set of rules for type inference. When the type of an expression is inferred using one of these rules, we call that a type judgement. Type inference and judgements allow us to reason about types in a lambda expression; and if any part of an expression winds up with an inconsistent type judgement, then the expression is invalid. (When Church started doing typed LC, one of the motivations was to distinguish between values representing "atoms", and values representing "predicates"; he was trying to avoid the Godel-esque paradoxes, by using types to ensure that predicates couldn't operate on predicates.)

I'm going to have to adopt a slightly unorthodox notation for type judgements; the standard notation is just too hard to render correctly with the software I'm using currently. The usual notation is something that looks like a fraction; the numerator consists of statements that we know to be true; the denominator is what we can infer from those. In the numerator, we normally have statements using a context, which is a set of type judgements that we already know; it's usually written as an uppercase greek letter. For type contexts, I'll use the names of greek letters written in uppercase letters. If a type context includes the statement that "x : A", I'll write that as "CONTEXT |- x : A". For the fraction inference notations, I'll use two lines labelled "Given:" and "Infer:". (You can see what the normal notation looks like by visiting wikipedia's STLC page.)

Rule1: (Type Identity)
Given: nothing
Infer: x : A |- x : A

The simplest rule: if we have no other information except a declaration of the type of a variable, then we know that that variable has the type it was declared with.

Rule2: (Type Invariance)
Given: GAMMA |- x : A, x != y
Infer: (GAMMA + y : B) |- x : A

This is a statement of non-interference. If we know that "x : A", then inferring a type judgement about any other term cannot change our type judgement for "x".

Rule3: (Parameter to function inference)
Given: (GAMMA + x : A) |- y : B
Infer: GAMMA |- (lambda x : A . y) : A -> B

This statement allows us to infer function types: if we know the type of the parameter to a function is "A"; and we know that the type of the value returned by the function is "B", then we know that the type of the function is "A -> B".

And finally, Rule4: (Function application inference)
Given: GAMMA |- x : A -> B, GAMMA |- y : A
Infer: GAMMA |- (x y) : B

If we know that a function has type "A -> B", and we apply it to a value of type "A", the result is an expression of type "B".

These four rules are it. If we can take a lambda expression, and come up with a consistent set of type judgements for every term in the expression, then the expression is well-typed. If not, then the expression is invalid.
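Just to show how mechanical those four rules are, here's a sketch of a checker for them in OCaml. None of this is from any real system - the type and constructor names are mine - and rule 2 shows up implicitly in the way a new binding shadows an older one in the context:

    (* types: base types like "N", and function types A -> B *)
    type ty = Base of string | Arrow of ty * ty

    (* terms: variables, typed lambdas, and applications *)
    type term =
      | Var of string
      | Lam of string * ty * term        (* lambda x : A . body *)
      | App of term * term               (* x y *)

    (* the context GAMMA: a list of "x : A" judgements *)
    type context = (string * ty) list

    let rec infer (gamma : context) (t : term) : ty =
      match t with
      | Var x ->                         (* rule 1: type identity *)
          (try List.assoc x gamma with Not_found -> failwith ("unbound " ^ x))
      | Lam (x, a, body) ->              (* rule 3: parameter to function *)
          Arrow (a, infer ((x, a) :: gamma) body)
      | App (f, arg) ->                  (* rule 4: function application *)
          (match infer gamma f with
           | Arrow (a, b) when infer gamma arg = a -> b
           | _ -> failwith "ill-typed application")

    (* example: applying (lambda x : N . x) to a variable three : N gives type N *)
    let () =
      match infer [ ("three", Base "N") ]
              (App (Lam ("x", Base "N", Var "x"), Var "three")) with
      | Base name -> print_endline name  (* prints N *)
      | Arrow _ -> print_endline "function type"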

Just for kicks, here's how we'd write types for the SKI combinators. These are incomplete types - I'm using type variables, rather than specific types. In a real program using the combinators, you could work out the necessary substitutions of concrete types for the type variables. Don't worry, I'll show you an example to clarify that.
  • I combinator: (lambda x . x) : A -> A
  • K combinator: (lambda x : A . ((lambda y : B . x) : B -> A)): A -> B -> A
  • S combinator: (lambda x : A -> B -> C . (lambda y : A -> B . (lambda z : A . (x z : B -> C) (y z : B)))) : (A -> B -> C) -> (A -> B) -> A -> C
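As a sanity check, if you write the combinators in OCaml, its inference engine recovers exactly these types, with 'a, 'b, and 'c playing the role of the type variables A, B, and C:

    let i x = x                (* 'a -> 'a *)
    let k x y = x              (* 'a -> 'b -> 'a *)
    let s x y z = x z (y z)    (* ('a -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c *)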
Now, let's look at a simple lambda calculus expression: lambda x y . y x. Without any type declarations or parameters, we don't know the exact type. But we do know that "x" has some type; we'll call that "A"; and we know that "y" is a function that will be applied with "x" as a parameter, so it's got parameter type A, but its result type is unknown. So using type variables, we can say "x : A, y : A -> B". We can figure out what "A" and "B" are by looking at a complete expression. So, let's work out the typing of it with x="3", and y="lambda a : N. a*a". We'll assume that our type context already includes "*" as a function of type "N -> N -> N".
  • (lambda x y . y x) 3 (lambda a : N . a * a)
  • Since 3 is a literal integer, we know its type: 3 : N.
  • By rule 4, we can infer that the type of the expression "a * a" where "a : N" is "N"; (* : N -> N -> N; it's applied to an N and an N) and therefore, by rule 3 the lambda expression has type "N -> N". So with type labelling, our expression is now: (lambda x y . y x) (3 : N) (lambda a : N . (a * a):N) : N -> N
  • So - now, we know that the parameter "x" of the first lambda must be "N"; and "y" must be "N -> N"; so by rule 4, we know that the type of the application expression "y x" must be "N"; and then by rule 3, the lambda has type: N -> (N -> N) -> N.
  • So, for this one, both A and B end up being "N".
So, now we have a simply typed lambda calculus. The reason that it's simply typed is because the type treatment here is minimal: the only way of building new types is through the unavoidable "->" constructor. Other typed lambda calculi include the ability to define parametric types, which are types expressed as functions ranging over types.

Monday, May 29, 2006

PEAR yet again: the theory behind paranormal gibberish

I've been looking at PEAR again. I know it may seem sort of like beating a dead horse, but PEAR is, I think, something special in its way: it's a group of people who pretend to use science and mathematics in order to support all sorts of altie-woo gibberish. This makes them, to me, particularly important targets for skeptics: if they were legit, and they were getting the kinds of results that they present, they'd be demonstrating something fascinating and important. But they're not: they're trying to use the appearance of science to undermine science. And they're incredibly popular among various kinds of crackpottery: what led me back to them this time is the fact that I found them cited as a supporting reference in numerous places:
  1. Two different "UFOlogy" websites;
  2. Eric Julien's dream-prophecy of a disastrous comet impact on earth (which was supposed to have happened last thursday);
  3. Three different websites where psychics take money in exchange for psychic predictions or psychic healing;
  4. Two homeopathy information sites;
  5. The house of thoth, a general clearinghouse site for everything wacky.
Anyway, while looking at the stuff that all of these wacko sites cited from PEAR, I came across some PEAR work which isn't just a rehash of the random number generator nonsense, but instead an attempt to define, in mathematical terms, what "paranormal" events are, and what they mean.

It's quite different from their other junk; and it's a really great example of one of the common ways that pseudo-scientists misuse math. The paper is called "M* : Vector Representation of the Subliminal Seed Regime of M5", and you can find it here.

The abstract gives you a pretty good idea of what's coming:
A supplement to the M^5 model of mind/matter interactions is proposed wherein the subliminal seed space that undergirds tangible reality and conscious experience is characterized by an array of complex vectors whose components embody the pre-objective and pre-subjective aspects of their interactions. Elementary algebraic arguments then predict that the degree of anomalous correlation between the emergent conscious experiences and the corresponding tangible events depends only on the alignment of these interacting vectors, i. e., on the correspondence of the ratios of their individual ‘‘hard’’ and ‘‘soft’’ coordinates. This in turn suggests a subconscious alignment strategy based on strong need, desire, or shared purpose that is consistent with empirical experience. More sophisticated versions of the model could readily be pursued, but the essence of the correlation process seems rudimentary.
So, strip out the obfuscatory babble, what does this actually say?

Umm... "babble babble complex vectors babble babble babble algebra babble babble ratios babble babble correlation babble babble." Seriously: that's a pretty good paraphrase. That entire paragraph is meaningless. It's a bunch of nonsense mixed in with a couple of pseudo-mathematical terms in order to make it sound scientific. There is no actual content to that abstract. It reads like a computer-generated paper from SCIgen.

(For contrast, here's a SCIgen-generated abstract: "The simulation of randomized algorithms has deployed model checking, and current trends suggest that the evaluation of SMPs will soon emerge. In fact, few statisticians would disagree with the refinement of Byzantine fault tolerance. We confirm that although multicast systems [16] can be made homogeneous, omniscient, and autonomous, the acclaimed low-energy algorithm for the improvement of DHCP [34] is recursively enumerable.")

Ok, so the abstract is the pits. To be honest, a lot of decent technical papers have really lousy abstracts. So let's dive in, and look at the actual body of the paper, and see if it improves at all.

They start by trying to explain just what their basic conceptual model is. According to the authors, the world is fundamentally built on consciousness; most events start in a pre-conscious realm of ideas called the "seed region"; and as they emerge from the seed region into experienced reality, they manifest in two different ways: as "events" in the material domain, and as "experiences" or "perceptions" in the mental domain. They then claim that in order for something from the "seed region" to manifest, it requires an interaction of at least two seeds.

Now, they try to start using pseudo-math to justify their gibberish. I'm going to modify the notation slightly to make it readable without using any symbols that are problematic on blogger; you can check the paper to see that I'm not changing anything but trivial notation.

Suppose we have two of these seed beasties, S_1, and S_2. Now, suppose we have a mathematical representation of them as "vectors". We'll represent that as a function, rep(S).

A "normal" event, according to them, is one where the events combine in what they call a "linear" way (scare-quotes theirs): rep(S_1) + rep(S_2) = rep(S_1 + S_2). On the other hand, events that are perceived as anomalous are events for which that's not true: rep(S_1) + rep(S_2) != rep(S_1 + S_2).

We're already well into the land of pretend mathematics here. We have two non-quantifiable "seeds"; but we can add them together... We're pulling group-theory type concepts and notations, and applying them to things that absolutely do not have any of the prerequisites for those concepts to be meaningful.

But let's skip past that for a moment, because it gets infinitely sillier shortly.

They draw a cartesian graph with four quadrants, and label them (going clockwise from the first quadrant): T (for tangible), I (for intangible - aka, not observable in tangible reality), U (for unconscious), and C (conscious). So the upper-half is what they consider to be observable, and the bottom half is non-observable; and the left side is mind and the right side is matter. Further, they have a notion of "hard" and "soft"; objective is hard, and subjective is soft. They proceed to give a list of ridiculous pairs of words which they claim are different ways of expressing the fundamental "hard/soft" distinction, including "masculine/feminine", "particulate/wavelike", "words/music", and "yang/yin".

Once they've gotten here, they get to my all-time favorite PEAR statement; one which is actually astonishingly obvious about what they're really up to:
It is then presumed that if we appropriate and pursue some established mathematical formalism for representing such components and their interactions, the analytical results may retain some metaphoric relevance for the emergence of anomalous mind/matter manifestations.
I love the amount of hedging involved in that sentence! And the admission that they're just "appropriating" a mathematical formalism for no other purpose than to "retain some metaphoric relevance". I think that an honest translation of that sentence into non-obfuscatory english is: "If we wrap this all up in mathematical symbols, we can make it look as if this might be real science".

So, they then proceed to say that they can represent the seeds as complex numbers: S = s + i(sigma). But "s" and "sigma" can't just be simply "pre-material" and "pre-mental", because that would be too simple. Instead, they're "hard" and "soft"; even though we've just gone through the definition which categorized hard/soft as a better characterization of material and mental. Oh, and they have to make sure that this looks sufficiently mathematical, so instead of just saying that it's a complex number, they present it in both rectangular and polar coordinates, with the equation for converting between the two notations written out inside the same definition area. No good reason for that, other than to have something more impressive looking.

Then they want to define how these "seeds" can propagate up from the very lowest reaches of their non-observable region into actual observable events, and for no particular reason, they decide to use the conjugate product equation randomly selected from quantum physics. So they take a random pair of seeds (remember that they claim that events proceed from a combination of at least two seeds), and add them up. They claim that the combined seed is just the normal vector addition (which they proceed to expand in the most complex looking way possible); and they also take the "conjugate products" and add them up (again in the most verbose and obfuscatory way possible); and then take the difference between the two sums. At this point, they reveal that for some reason, they think that the simple vector addition corresponds to "rep(S_1) + rep(S_2)" from earlier; and the conjugate is "rep(S_1+S_2)". No reason for this correspondence is given; no reason for why these should be equal for "non-anomalous" events; it's just obviously the right thing to do according to them. And then, of course, they repeat the whole thing in polar notation.

It just keeps going like this: randomly pulling equations out of a hat for no particular reason, using them in bizarrely verbose and drawn out forms, repeating things in different ways for no reason. After babbling onwards about these sums, they say that "Also to be questioned is whether other interaction recipes beyond the simple addition S_12 = S_1 + S_2 could profitably be explored."; they suggest multiplication; but decide against it just because it doesn't produce the results that they want. Seriously. In their words "but we show that this doesn't generate similar non-linearities": that is, they want to see "non-linearities" in the randomly assembled equations, and since multiplying doesn't have that, it's no good to them.

Finally, we're winding down and getting to the end: the "summary". (I was taught that when you write a technical paper, the summary or conclusion section should be short and sweet. For them, it's two full pages of tight text.) They proceed to restate things, complete with repeating the gibberish equations in yet another, slightly different form. And then they really piss me off. Statement six of their summary says "Elementary complex algebra then predicts babble babble babble". Elementary complex algebra "predicts" no such thing. There is no real algebra here, and nothing about algebra would remotely suggest anything like what they're claiming. It's just that this is a key step in their reasoning chain, and they absolutely cannot support it in any meaningful way. So they mask it up in pseudo-mathematical babble, and claim that the mathematics provides the link that they want, even though it doesn't. They're trying to use the credibility and robustness of mathematics to keep their nonsense above water, even though there's nothing remotely mathematical about it.

They keep going with the nonsense math: they claim that the key to larger anomalous effects resides in "better alignment" of the interacting seed vectors (because the closer the two vectors are, in their framework, the larger the discrepancy between their two ways of "adding" vectors); and that alignments are driven by "personal need or desire". And it goes downhill from there.

This is really wretched stuff. To me, it's definitely the most offensive of the PEAR papers. The other PEAR stuff I've seen is abused statistics from experiments. This is much more fundamental - instead of just using sampling errors to support their outcome (which is, potentially, explainable as incompetence on the part of the researchers), this is clear, deliberate, and fundamental misuse of mathematics in order to lend credibility to nonsense.

Friday, May 26, 2006

Finally: the Kripke Model for Intuitionistic Logic

As promised, today, I'm going to show the Kripke semantics model for intuitionistic logic.

Remember from yesterday that a Kripke model has three parts: (W, R, ||-), where W is a set of worlds; R is a transition relation between worlds; and "||-" is a forces relation that defines what's true in a particular world.

To model intuitionistic logic, we need to add some required properties to the transition and forcing relations. With these additional properties, we call the transition relation "<="; "A <= B" means that B is accessible from A. For an intuitionistic Kripke model, the transition relation "<=" must be both reflexive (A <= A) and transitive (if A <= B and B <= C, then A <= C), making it a pre-order relation - that is, a relation that has the essential properties of "<=" that are necessary to define a partial order over its domain. The forces relation, "||-" also needs some additional conditions: (these conditions are based on the definitions from the wikipedia article on Kripke models; I can't remember them without help, so I used the article as a reference.)
  1. for all w,u in W such that w <= u, for all statements p: if w ||- p, then u ||- p. This condition basically says that if a statement is true/valid in a particular world w, then it must be true in all worlds accessible from w - meaning that in intuitionistic logic, once a statement is proven, it can't be taken back.
  2. for all w in W, w ||- A and B if/f w ||- A and w ||- B. "A and B" is provable in w only if both A and B are provable.
  3. for all w in W, w ||- A or B if/f w ||- A or w ||- B. "A or B" is provable in w only if either A or B is provable in w.
  4. w ||- A implies B if/f for all u such that w <= u, (u ||- A) implies (u ||- B). Being able to prove that "A implies B" in w means that in every world accessible from w, if you can prove A, then you can prove B.
  5. There is no contradictory statement (written _|_) such that w ||- _|_.
The last line in there is actually a bit of a cheat: you can think of intuitionistic logic as a logic where at a given point of time, a statement can have three truth values: T (top), which means provably true; _|_ (bottom) which means provably false, and unproved. The last property of the forces relationship means that under no conditions can the "forces" relation say that a proposition is true if it is provably false.
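Here's a tiny concrete sketch of that, with worlds and an atom name that I made up for the example: two worlds with w0 <= w1, where the atom P is forced only at w1. At w0, P isn't forced; but "not P" isn't forced either, because a world accessible from w0 does force P. So "P or not P" isn't forced at w0 - which is exactly the failure of the excluded middle that distinguishes intuitionistic logic.

    (* worlds 0 and 1; the preorder is 0 <= 0, 0 <= 1, 1 <= 1 *)
    let worlds = [ 0; 1 ]
    let le a b = (a = b) || (a = 0 && b = 1)

    (* the atom P is forced only at world 1; being monotone, it stays forced there *)
    let forces_atom w p = (w = 1 && p = "P")

    (* w ||- not A  if/f  no world u with w <= u forces A *)
    let forces_not w p =
      List.for_all (fun u -> not (le w u) || not (forces_atom u p)) worlds

    let () =
      (* prints false and false: neither P nor (not P) is forced at w0 *)
      Printf.printf "w0 ||- P: %b,  w0 ||- not P: %b\n"
        (forces_atom 0 "P") (forces_not 0 "P")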

So: suppose we have an intuitionistic logic consisting of a set of statements, which we'll call X. The Kripke model for X is the triple (W, <=, M), where W is the set of worlds related by a "<=" relation that has the intuitionistic properties that we just discussed. M is a set of sets: for all w in W, there exists one M_w in M, where M_w is an L-structure for X. (An L-structure is a formal way of defining what statements are proved true - and for statements with variables, the bindings of those variables to values for which the statement has been proved true. For the purposes of understanding, you can think of M_w as a function that takes a logical statement, and a list of values for the variables in that statement, and returns "true" if the statement is proved true, and false if the statement is proved false; and the statement is not in the domain of the function if its truth value hasn't been proved.) The M_w's for intuitionistic logic have a set of properties that come from the meaning of statements in the logic:
  1. if u <= v, then domain(M_u) subset domain(M_v)
  2. For any function f in the logic, for all a in domain(f) in M_u: the value of f(a) in M_u equals the value of f(a) in M_v (again assuming u <= v).
  3. For any predicate P(a_1,a_2, ..., a_n), if P(a_1,a_2, ..., a_n) is true in M_u, then it is true in M_v (again assuming u <= v).
Now, finally, we can get to the last step in defining our model of intuitionistic logic: the forces relation based on M. What we need to do to make M work as a forces relation is to separate it into two parts: the truth/validity of concrete statements (that is, statements with all variables bound to specific values), which we write using the old forces notation, ||-; and the mapping of variables and functions to values. We call that mapping an evaluation function; for a statement A in X, we write A[e] for A with variables bound by e.

So: given an evaluation e of variables from M_w, w ||- A[e] is defined by:
  1. w ||- P(a_1,...a_n)[e] if/f P(a_1[e],...,a_n[e]) is in M_w: Evaluating a predicate means the same thing as keeping the predicate unchanged, and evaluating each of its parameters separately.
  2. w ||- (A and B)[e] if/f w ||- A[e] and w ||- B[e]: a typical and definition.
  3. w ||- (A or B)[e] if/f w ||- A[e] or w ||- B[e]: a typical or definition.
  4. w ||- (A implies B)[e] if/f forall w <= u: u ||- A[e] implies u ||- B[e]: an implication is true if, in the model, you can prove its constituents in the next step. (This is the opposite of its definition in the logic: in logic, we'd say that given A implies B and A, you can conclude B. In the model, we go the other way, and say that you can conclude that A implies B if, for all possible transitions, the next world allows you to conclude either B, or not A.)
  5. w NOT ||- _|_: no w can force a contradiction.
  6. w ||- (exists x : A)[e] if/f exists a in M_w such that w ||- A[e], and e maps x to a. An exists statement is provable only if we can show a concrete, specific value for which the statement is true.
  7. w ||- (forall x : A)[e] if/f forall u such that w <= u: forall a in M_u: u ||- A[e] where e maps x to a: a universal statement is provable if every possible next world can prove the statement under every possible binding of its variable.
What we've done now is to define a way of describing the meaning of any statement in intuitionistic logic, and any reasoning step in intuitionistic logic in terms of sets, in a way where if we do the translation of the logical statement down to the sets, the statements about the sets will always be true and correct; and if we reason from the sets that correspond to the logic, anything we can do will produce a well-defined, meaningful, and valid statement in the logic.

The particularly nice thing about an intuitionistic logic with a Kripke model is that for any statement which is provably true, we can use the model to find specific values for the variables in the statement for which it's true; and if a statement is false, we can find specific values for the variables in the statement that produce a concrete counterexample. We can't always do that in regular predicate logic: regular predicate logic allows us to do inferences from things like "exists" statements if we know that some value must exist for which the statement is true, even if there's no way for us to figure out what that value is. Likewise, in regular predicate logic, we can infer that a forall statement is false if we know that there must be some value for which it's false; in intuitionistic logic, we need to be able to show a specific example for which it's false.

Friday Random Ten, 5/26

  1. Lunasa, "Punch". Trad irish, by the best in the business.
  2. Dirty Three, "In fall". The Dirty Three is one of those "post-rock ensembles" which have me seriously hooked. Very, very cool stuff, which kind of defies description. Just get some and listen to it!
  3. The Flower Kings, "Monsters and Men". A 20 minute long track of the newest FK release. As usual for the Flower Kings, it's bloody brilliant.
  4. Peter Hammill, "The Wave". Peter Hammill is one of the original progressive rock guys, the founder of "Van der Graaf Generator". This is off of a live solo CD with Peter doing vocals and keyboards, accompanied by an electric violin. That's it - no drums, no guitars, just voice, keyboard, and violin.
  5. Glass Hammer, "Run Lissette". Glass Hammer is a recent discovery of mine; they're an American neo-progressive band. They sound something like a cross between Yes and old Genesis, with a bit of Flower Kings for good measure. They're a really great band.
  6. Porcupine Tree, "Sleep of No Dreaming".
  7. John Corigliano, "Etude Fantasy 5: Melody" from "Phantasmagoria". Corigliano is one of the finest composers working today, and this is a great piece of his work. Not quite up there with his clarinet concerto, which is one of my very favorite modern compositions, but with stuff this good, "not as good as his best" is also "100 times better than anything else".
  8. ProjeKct Two: "Sector Drift [The Planet Zarg Quartet Pt. 6]". Goofy free improv from one of the King Crimson projects - Fripp, Belew, and Gunn, with Belew playing a percussion synthesizer instead of a guitar. Surprisingly good considering that Belew had no experience with the instrument before recording this, live.
  9. Tempest, "The Barrow Man". A neoprogressive electric folk band. These guys are interesting. The leader of the band is Swedish; they play a blend of Swedish and Celtic based melodies, in a progressive rock style. Sort of Fairport Convention on steroids, led by an electric mandolin.
  10. Whirligig, "Through The Bitter Frost And Snow". A really beautiful ballad from a NYC based trad Irish band. There's also a trad Irish band from Ireland called Whirligig, which has nothing to do with the NYC band - one of those unfortunate collisions that tends to happen with very small-label bands.

Thursday, May 25, 2006

Magic-23 update: the author responds

I received a response to my post Magic-23 from the author of the article I was mocking. As I did earlier with Berlinski, I think that if the author is willing to take the time to respond to me, it's only fair to let their response be seen by readers. Mr. Osborne took the time to write a very civil response, and so I'm going to post it here, with my own comments at the end. His message is unedited, and quoted verbatim:
Hi Mark,

I read your essay 'Good Math, Bad Math', which criticizes my work on the 23.5-degree references and makes it all look rather silly, which is disappointing really.

I'm not really that interested in the '23' number phenomenon, neither am I "in on a joke", as one of your correspondents states, and neither am I out to exploit anything and make a quick buck.

This is a real phenomenon. I have spent years researching this, and many others have verified my findings.
I think my presentation is well balanced and is merely based on what I have discovered despite my own beliefs, if any.
Someone has to discover and bring attention to this. Surely it was intended that someone should.

Attached is a chapter from my book. This is what all these references are leading to and more.

Also, viewing the images would help. See here, another version.

http://www.freewebs.com/garyosborn/235degrees.htm


Also take a look at this version of the 23.5 Degree presentation on my website.

http://garyosborn.moonfruit.com/revelations

Also I don't have a book entitled 'Revelations'.

So at the end of the day, I am merely bringing attention to the things I have discovered and I don't see that as wrong, and I don't think my findings are "chock full of insanity" either.

Don't want to lecture you Mark, but first and foremost. I'm always careful not to believe in something that will stop me finding other alternatives, and most of the time, and if I search long enough within my own mind, I find that I really don't believe in anything 100%.
I always try to remain neutral and balanced.
We are all prone to experiencing, seeing and believing whatever it is we are focusing on at the time and really because we are creating everything ourselves, and many of us don't realise it.

Read my article The Synchroholoform on my website. There's a "bigger picture" to all this and it appears to be related to the creative nature of our own consciousness and that's something I always consider.

Oh and I too like The Flower Kings and go to see them whenever the opportunity arises - so that's something we have in common.

Kind Regards,
Gary Osborn
My response is going to be brief.

Every crackpot in the world, upon being confronted by a debunker, invariably comes back with some kind of comment about having an open mind. Having an open mind is important - but so is being skeptical. There's a lot of gibberish out there, and there are a lot of people who are either wrong, crazy, or deliberately deceptive: you should always be open to new ideas, but at the same time, you have to look at them very carefully. The reason that I think the 23.5 degree stuff is so goofy isn't because I don't have an open mind - it's because it doesn't meet the test of credibility.

For it to be believable, I would need to accept:

  • That you can, looking at a painting, distinguish between the
    "perspective angle" of 22.5 degrees, and the axis angle of 23-23.5 degrees.
  • That painters throughout history could both distinguish and paint angles so accurately that they could deliberately make a line between 1/2 and 1 degree different from the standard perspective angle.
  • That there was such a threat from the church that this knowledge had to be kept secret, but at the same time, it was so widespread that there are dozens of different painters and architects who all not only knew about it, but visibly inserted it in their work.
I don't buy that, at all.

But it gets worse: to get to your conclusion about a catastrophe that pushed the earth onto its tilted axis, I would have to accept:
  • Our understanding of gravity - which tests out accurately to an incredible degree - is entirely wrong. We can plan space probe trajectories that do double gravitational slingshots and wind up in exactly the right position - travelling millions of miles, and yet reaching a very precise destination with near-perfect accuracy.
  • The stability of the solar system is an illusion. The fact that we have a uniform ecliptic plane, with a regular pattern of near-circular orbits is something close to impossible to explain mathematically if we assume that there was a catastrophic change in the solar system during the period of human history.
  • Something drastic enough to tilt the earth on its axis and reconfigure the entire solar system occurred, and yet life on earth was pretty much unaffected; and the event left no geological evidence at all.
  • A catastrophic change occurred recently enough that people like the Egyptians were able to record it; and yet there are no explicit records of the event - not from the Egyptians, the Babylonians, the Chinese - none of them recorded it in anything like explicit terms - all just used highly ambiguous metaphorical allegories.
Nope. No way this is credible - physics, mathematics, geology, and history all argue against it.

Moving towards models: Kripke Semantics

To get the formal semantics of intuitionistic logic (and thus lambda calculus), we need a tool called Kripke Semantics. Kripke semantics is an amazing tool, which made it possible to define a lot of things in logics that weren't approachable before. In this post, I'm going to explain the basic ideas of Kripke semantics and how they're used for building models of logics; in my next post, I'll describe the Kripke semantics model for intuitionistic logic.

Modality

The basic idea of Kripke semantics is to associate the truth or validity of logical statements with something like a concept of time called a modality. In a modality, you have a set of states called worlds, with a transition relationship between them called accessibility. A world w assigns truth values to logical statements; and the accessibility relation defines what worlds can be reached from w. (For notation, we usually call the accessibility relation R, and write R(a,b) to mean that b is accessible from a.)

With modality, you have a new property in your logic: the truth value of statements can change when you transition from one world to another. To be able to talk about this, Kripke models have two modal operators that behave sort of like quantifiers over worlds.
  1. L (also written as a square) for "necessarily": LP means that P is true in every possible world;
  2. M (also written as a diamond) for "possibly": MP means that P might be true in some world.
L and M are duals, very similar to a temporal version of "forall" and "exists". Just like "forall x : P(x)" is equivalent to "not exists x : not P(x)", MP = not L not P.

Kripke Models

The semantics of Kripke models comes from the ability to make truth assignments; and truth assignments in turn come from something called a forcing relation. A world "w" in a Kripke semantics model forces statements in the logic if those statements are valid (provable) in that world. The forcing relation "w forces A" is usually written: "w ||- A". A valid forcing relation needs to have three properties:
  1. w ||- not A if/f not(w ||- A): if w forces not A, then it cannot also force A.
  2. w ||- A -> B if/f not(w ||- A) or (w ||- B). This is very close to one of the inference rules of normal predicate logic: A -> B if/f (not A) or B.
  3. w ||- LA if/f forall u in W (R(w,u) implies u ||- A): we can say that a statement is necessarily true in a world w if and only if that statement is true in every world accessible from w.
With that, we've now got enough to be able to define what a model is in Kripke semantics: A Kripke model for a logic L consists of three things, usually written as a triple (W, R, ||-): W is a set of worlds; R is an accessibility relation between elements of W; and ||- is a forcing relation from members of W to statements in L. Without the forcing relation, given just the worlds and the accessibility relation, that piece of a model is called a Frame.

A logical statement or formula A is true in a model (W,R,||-) if forall w in W, w ||- A.
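Here's a minimal sketch of a (tiny, finite) Kripke model in OCaml - the worlds, the accessibility relation, and the atom are all just made up for illustration:

    (* three worlds; r lists which worlds are accessible from which *)
    let worlds = [ 0; 1; 2 ]
    let r = [ (0, 1); (0, 2); (1, 1); (2, 2) ]

    (* the forcing relation for the atomic statement "P" *)
    let forces w p = (p = "P") && (w = 1 || w = 2)

    (* LP ("necessarily P") at w: P is forced in every world accessible from w *)
    let necessarily w p =
      List.for_all (fun (a, b) -> a <> w || forces b p) r

    (* MP ("possibly P") at w: P is forced in some world accessible from w *)
    let possibly w p =
      List.exists (fun (a, b) -> a = w && forces b p) r

    let () =
      (* world 0 only reaches worlds 1 and 2, and both force P, so both print true *)
      Printf.printf "0 ||- LP: %b,  0 ||- MP: %b\n" (necessarily 0 "P") (possibly 0 "P")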

Building Kripke Models for Logics

Suppose we have a modal logic, X. What we want in a model for X is a way of saying what statements mean in a way that shows that they're consistent - that is, that the statements that are provable in the logic are really valid inferences from the axioms of the logic.

The basic construct of Kripke models for logics is the maximum consistent set. A set of statements S is consistent in a logic X if there is no way to derive a contradiction from any combination of statements in S, axioms of X, and standard inference (if A->B and A then B) over S and axioms of X. A maximum consistent set MCS for X is a consistent set for X where MCS is not a proper subset of any other consistent set in X.

The Kripke model of a logic X is a standard Kripke model (W,R,||-), where:
  • W is the set of all MCSs for X.
  • R(A,B) if/f for all statements s in X, (L s) in A implies s in B
  • A ||- s if/f s in A
One incredibly cool property of the MCS construction of a Kripke model of a logic X is: for every statement s which can be shown to be unprovable in X, the Kripke model contains a counterexample. This will turn out to be really valuable in the model for intuitionistic logic: it means that in intuitionistic logic, we can enumerate the values for which a statement is true.

Skeptics Circle

The Skeptics Circle number 35 is up at Skeptico with a special guest blogger! Head on over and check it out, for a good laugh and a lot of good skeptical blogging.

Wednesday, May 24, 2006

Logic Fun: Intuitionistic Logic

It's time for a bit of a side-track from the series of articles about lambda calculus. Models of lambda calculus are based on something called intuitionistic logic, so I'm going to write a bit about that. It's also quite interesting in its own right.

Syntactically, intuitionistic logic looks the same as first order predicate logic. But the meanings of statements in it are often quite different. The basic idea behind intuitionistic logic is that some of the statements that you can make in propositional logic are too strong. In intuitionistic logic, a statement is only true if there is a proof that it is true; a statement that something is false means that that statement cannot be proved - so the negative statement "not P" can be read as an affirmative statement: "There is a proof that there is no proof for P".

This has the implication that many inferences that we're used to in propositional logic are not accepted in intuitionistic logic:
  • In propositional logic, the statement "X or not X" is a tautology, that is a statement that is always true: the statement means that either X is true, or X is not true. In intuitionistic logic, it is not: the statement "X or not X" means that we can prove that X is true or we can prove that X is false. X is true in intuitionistic logic if and only if we have a proof that X is true; X is false if and only if we can prove that there is no proof for X. If we do not have a proof for either the truth or the falsehood of X, then "X or not X" is not a true statement: it's possible that X is true, but we don't have a proof for it.
  • In propositional logic, we can often infer statements like "exists x : P(x)", even though we don't know what x is. In intuitionistic logic, to infer that statement, it must be possible to demonstrate at least one value for which it's true. Similarly, in order to prove that a universal statement is false, we need to be able to show a concrete counterexample that demonstrates its falsehood.
  • In both propositional and intuitionistic logic, the statement that "P -> not not P" is true. But in intuitionistic logic, "not not P -> P" is not true. Since "not P" means "We can prove that there is no proof of P", the fact that P is true (i.e., that we have a proof that P is true) means that we can prove that "not P" is false - because the fact that there is a proof for P provides a concrete counterexample to the statement that there is no proof for P - so "not not P", which says "There is no proof that there is no proof for P" is proven true.

    But going the other direction: the fact that we can't prove that there is no proof for P does not imply that there is a proof for P. Being unable to prove that there is no proof for something is a much weaker statement than saying that that thing is true.
One way to think of intuitionistic logic is to think of it as if there were actually three values that can be assigned to a predicate: it can be in one of two definite states (proved true or proved false) or it can be in an indefinite state (unproven). If a predicate P is unproven, then we can assert neither that P is true nor that P is false. But if the predicate is in an indefinite state, and we can show a proof, then we can move it to a definite state: if we can find a way to provide a proof of P, then it becomes true; or if we can show that it's unprovable (generally through a counterexample), then it becomes false.
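If the three-state picture helps, here's a tiny sketch in Python - purely my own illustration of that analogy, not a real semantics for intuitionistic logic - showing why "X or not X" only comes out true once X has been settled one way or the other:

TRUE, FALSE, UNPROVEN = 'proved true', 'proved false', 'unproven'

def neg(x):
    if x == TRUE:
        return FALSE      # a proof of X refutes "there is no proof of X"
    if x == FALSE:
        return TRUE       # a proof that X is unprovable is exactly a proof of "not X"
    return UNPROVEN       # no proof either way yet

def disj(a, b):
    if TRUE in (a, b):
        return TRUE       # a proof of either disjunct is a proof of the disjunction
    if a == FALSE and b == FALSE:
        return FALSE
    return UNPROVEN       # otherwise we have no proof of "a or b"

for x in (TRUE, FALSE, UNPROVEN):
    print(x, '->', disj(x, neg(x)))
# "X or not X" comes out proved only when X itself has already been settled.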

The formal semantics of intuitionistic logic are more complicated than the semantics of predicate logic. They tend to involve things like lattice theory. In another post, I'll talk about one way of describing the semantics of intuitionistic logic called Kripke semantics. Kripke semantics is useful not just for intuitionistic logic, but for a lot of other things, like temporal logics, modal logics, and model checking. (Wikipedia describes another version of the semantics of intuitionistic logic based on something called Heyting algebras, but I'm not familiar with it.)

Tangled Bank #54

There's a new Tangled Bank up at Coturnix's. Good stuff, go read.

On a related note, since Coturnix mentions his upcoming move, I'll also mention that "Good Math, Bad Math" will be making the jump to ScienceBlogs in early June. The misery of coping with Blogger will soon come to an end!

Tuesday, May 23, 2006

Nyah, Nyah, my infinity is bigger than yours!

One really weird thing that you can do with math is compare the sizes of infinitely large things. It's a really strange notion, but it's actually a fairly important one - and one which ties together computation and pure mathematics in a very interesting, clean way.

Suppose I have two infinite sets. I can show that these two infinite sets are the same size, by showing that I can build a total one-to-one map between the two sets.

For example, take the set of even numbers, and the set of odd numbers. They're the same size - because I can create a mapping function f from even to odd: f(n) = n + 1. For every even number, it maps to exactly one odd number; for every odd number, there's exactly one even number that maps to it.

A more counter-intuitive example would be: take the set of even integers, and the set of all integers. We know that the set of even integers is a subset of the set of all integers. Intuitively, most people would guess that the set of integers is twice as large as the set of even integers. But the two sets are the same size, because I can do a perfect mapping from integers to even integers: f(n) = 2*n. Every integer maps onto exactly one even integer; every even integer is mapped to by exactly one integer. So they're the same size.

Ok. How about the set of rationals and the set of integers? Same size. Make a two-dimensional table with the integers on both dimensions: one axis for numerators, one for denominators. Now, trace the diagonals: 1/1, 2/1, 1/2, 3/1, 2/2, 1/3, 4/1, 3/2, 2/3, 1/4. Delete anything you've seen before. (1/1 is the same as 2/2, so we'd delete 2/2.) That gives us a list containing every possible rational number. Map each rational to its position in the list, and poof! A one-to-one mapping from rational numbers to integers.
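If you want to see that mapping concretely, here's a small sketch in Python - my own illustration, and it only walks the positive rationals; the same trick extends to all the rationals by interleaving signs and zero:

from fractions import Fraction
from itertools import count, islice

def rationals():
    # Walk the diagonals of the (numerator, denominator) table, skipping
    # anything we've already seen (e.g. 2/2 duplicates 1/1).
    seen = set()
    for total in count(2):                  # total = numerator + denominator
        for num in range(total - 1, 0, -1):
            q = Fraction(num, total - num)
            if q not in seen:
                seen.add(q)
                yield q

# Pairing each rational with its position in the list is the one-to-one mapping.
for position, q in enumerate(islice(rationals(), 10), start=1):
    print(position, q)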

So, what's bigger?

The set of irrational numbers is larger than the set of rational numbers.

To remind anyone who doesn't remember: a rational number is either an integer, or a real number that can be represented as the ratio of two integers. So 1/2, 17/4, etc., are all rational numbers. On the other hand, pi, the square root of two, e (the base of the natural logarithm), and many other things are irrational numbers - numbers that can't be represented by a fraction. When you write a rational number out in decimal form, it will always end in a repeating sequence of digits: 1/3 is 0.3333333... - the three repeats forever; 1/2 is 0.500000000... - the 0 repeats forever. An irrational number's decimal expansion will go on forever without ever settling into a repeating pattern. Here, you can find the first 4 million digits of pi.

How can I show that the set of irrationals is larger than the set of rationals?

Well, basically, I can't define a map from integers to irrationals. Irrationals are weird buggers, and I can't really say where they all are. They don't lay out nice and evenly the way that the rationals do.

In fact, it turns out that no matter how I try to form a map from integers to irrationals, I'll be missing something. Here's a quick version of why. It's a classic diagonalization argument.

Let's look at the set of all real numbers between zero and one. Now, suppose that this set is the same size as the set of integers. This would mean that there is a mapping from integers to real numbers between zero and one.

If that's true, then we can, in theory, make a list of the decimal expansions of all of the real numbers between zero and one.

Then we can define a function, r(n), which gives us the n'th digit of the n'th real number in the list.

Now here's where we pull a fast one. We define another function i(n). If r(n) = 1, then i(n) = 2. If r(n) != 1, then i(n) = 1. If we lay out a new number 0.(i(1),i(2),i(3),i(4),...), then that number is different from any number in our list of reals in at least one digit. So we have a real number between zero and one - and it's not in our list. Contradiction, therefore the list can't exist. (Note, this was corrected to fix a typo; originally, I accidentally wrote "0.(r(1),...)". Thanks to the anonymous commenter who pointed it out.)
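Here's the diagonal construction as a small sketch in Python - my own illustration, using a made-up finite prefix of the supposed list, since the whole point is that the complete list can't exist:

def i(r):
    # i(n) as defined above: 2 if the n'th diagonal digit is 1, otherwise 1.
    return 2 if r == 1 else 1

# Pretend these are the first few entries of the (impossible) complete list,
# each given as its digits after the decimal point.
listed = [
    [1, 4, 1, 5, 9],   # 0.14159...
    [5, 0, 0, 0, 0],   # 0.50000...
    [3, 3, 3, 3, 3],   # 0.33333...
]
new_digits = [i(row[n]) for n, row in enumerate(listed)]
print("0." + "".join(str(d) for d in new_digits))
# Prints 0.211: a number that differs from the n'th listed number in its
# n'th digit, so it can't appear anywhere in the list.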

So now we know that the set of real numbers between zero and one is larger than the set of integers. Since the set of rationals is the same size as the set of integers, and the reals are just the rationals plus the irrationals, the only way that the set of reals can be larger than the set of integers is if the set of irrationals is larger than the set of rationals.

How does this connect to computation? Well, there's a notion of something called a countable set. A countable set is a set which is either finite, or which has a one-to-one mapping from the integers. It turns out that the set of countable sets is a superset of the set of recursively enumerable sets. Infinite sets larger than the countable sets cannot be enumerated by any computing device. They are fundamentally non-computable things. (The last paragraph was heavily edited to correct an error pointed out in the comments. See the comment thread for the details. As usual, thanks to the commenters who pointed out the errors!)

Monday, May 22, 2006

RePEARing Bad Math

I was looking at the site-meter statistics for this site, and who was referring to it, and I came across a discussion linking to the Global Consciousness Project. The GCP is a very flaky train-wreck which is trying to measure the purported cumulative effects of human consciousness on, well, on any old random thing they can think of. It turns out that the GCP is another face of PEAR at Princeton. Man, what is it with Princeton?

What GCP does is run the PEAR random number generators non-stop, all the time, logging the results. Then, when there's an "interesting" event in the world, they go back to the logs and see if they can find any "anomalous patterns" in the data - where anomalous patterns are short periods of time during which the mean of the random numbers is different from the expected mean.

Here's an example of what they do, in their own words, from here:
The first formal prediction for the EGG project was made by RDN, in 1998-08-08, while traveling in Halifax. It concerned the Embassy bombings in Nairobi and Tanzania, on 1998-08-07 at 07:35 UTC. Upon my return home, prior to examining data, I made the specific predictions described below. These terrorist attacks exemplify a tearing of the social fabric that would shock a global consciousness temporarily. For this particular event, we had only the general outline for predictions: a period of time, say 10 minutes, surrounding the point event, and a period, say a few hours, following the event during which the world becomes conscious of what has happened.

An exploratory analysis based on 15-minute data segments from three eggs looked at the event-period from 07:15 to 07:45, and a three-hour consciousness-spreading period from 07:15 to 10:00. The associated probabilities indicate significant deviations for both time-periods. At that time we did not have sophisticated processing capabilities, but a hand calculation could be made using the automatically generated Z-scores for 15-minute blocks. It indicated significant deviations in both the short period defined for the event: Chi-square 18.039, 9 df, p=0.035, and for the aftermath, inclusive of the event: Chi-square 69.536, 36 df, p=0.00066.
So: they pick an event that they believe should affect the "global consciousness". Then they take short time periods associated with the event, and search for a time period where the mean value from their random number generator is not exactly what it should be.

They have an extensive list of events on their site, which they claim are used "rigorously" to test whether there are anomalous patterns associated with events. For most of them, they were able to discover one-second intervals that were "anomalous".

What's wrong with this?

Given a huge database consisting of sequences of random numbers, you expect to see small deviations. If it never deviated from the mean, you would actually conclude that the data was fake; you expect to see some fuzz. Now, take that huge quantity of data (they're generating 200 bits per second at each of 98 different random number generators); and take a swath of time (minutes to hours) associated with an "event", and see if you can find a one-second period of time for which the random numbers deviate from the expected mean. For any event, at any point in time, you can probably find a "significant" deviation for a couple of seconds; and they consider one second enough to be meaningful.
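Here's a rough simulation of exactly that game - my own sketch in Python, with parameters loosely modeled on their setup, not their actual data or code. It generates an hour of genuinely random bits and then hunts for the most "significant" one-second window:

import random

random.seed(1)
BITS_PER_SECOND = 200
SECONDS = 3600          # one hour-long "event window"

def z_score(ones, n=BITS_PER_SECOND, p=0.5):
    mean, sd = n * p, (n * p * (1 - p)) ** 0.5
    return (ones - mean) / sd

zs = [z_score(sum(random.getrandbits(1) for _ in range(BITS_PER_SECOND)))
      for _ in range(SECONDS)]
best = max(zs, key=abs)
print("largest one-second |z| in an hour of pure noise:", round(abs(best), 2))
# With 3600 one-second windows to choose from, a |z| above 3 (nominally
# p < 0.003) turns up routinely - which is why cherry-picking one
# "significant" second out of an event window proves nothing.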

They actually make data from their generators available through their website; when I get a chance, I'm going to try to prove my point by trying a few dates that have no particular significance to anyone but me: the times my children were born (7/30/2000, 9:35am, and 4/10/2003, 8:40pm), the tenth anniversary of my wedding (the specific 20 minutes of the ceremony) (6/5/2004, 2-2:20pm), and the only time my dog ever bit anyone (that being me, for freaking out while he vomited all over my house on, I think, 7/10/1998, around 5pm).

I'll let you know the results.

Probability and Fine Tuning

The ways that creationists play "big numbers" games never cease to amaze me. Over the weekend, I came across yet another extremely silly attempt to show that life on earth is so improbable that it must be the result of divine intervention.

The reason that I'm mentioning this one is that it's a different approach to generating the big number. Instead of sticking with the usual argument about the supposed probability of abiogenesis on earth, it tries to use a probabilistic version of the fine-tuning argument: that there are specific properties of the universe as a whole, and of the earth in particular, that make life possible, and that the likelihood of all of those properties having the correct values is so small that they must be the result of deliberate divine intervention to make life on earth possible.

This bugger is just chock-full of problems. The most important one is what I call the fake numbers error: most of the numbers are just made up. For example, they cite the "local abundance and distribution of dark matter" as a property that must have a specific value, and state the probability that it will fall within the range required for life as "0.1". At present, we don't know precisely what dark matter is, what properties it has (beyond the fact that it has a gravitational influence on bright matter), or how it's distributed. So claiming that its distribution is a required property of the universe for life is silly, and assigning a meaningful specific probability to that distribution matching the current unknown value is impossible. Many of their "parameters" for life are like this: they're just numbers pulled at random out of a hat. They've got 322 factors: my estimate is that at least 90% of those are fabricated. What's worse is that many of those aren't just made up numbers; they're utterly silly as parameters for life. I mean, "infall of buckminsterfullerenes from interplanetary and interstellar space upon surface of planet"?

They also make liberal use of the false independence problem. This is where you take two related factors, but treat them as if they were unrelated, in order to inflate your probabilities. A trivial example of this kind of problem is: what's the probability of flipping a coin twice, and getting heads both times? 1/4. What's the probability of flipping a coin three times, and getting heads all three times? 1/8. Good so far. Now, suppose that I've already flipped a coin twice, and gotten heads both times. What's the probability that I'm going to get 4 heads in a row? A false independence assumption would say 1/16 - but that ignores the fact that the first two heads are already in the bank, so only two flips remain, and the real probability is 1/4. I'm already halfway there.

Examples of this from their list: they take "relative abundances of different exotic mass particles" and "decay rates of different exotic mass particles" as independent factors. They're not independent: the abundances of the exotic particles and their decay rates are both dependent on the properties of the universe that allow those particles to exist. Or, even more obvious: "proximity to supernova", "timing of supernova", and "supernova rates and timing" as three different parameters. They play this game over and over again, creating lists of factors to inflate their improbabilities, even though those factors are not independent.

What's particularly humorous is the number of times that they include "parameters" for the probability of life on earth that are based on the existence of life on earth! By my count, they have at least 18 factors that explicitly depend on the existence of life on earth in order to argue that life on earth is improbable! (9 parameters specify quantities of certain kinds of bacteria, one specifies the quantity of fungi, one the quantity and timing of vascular plants, one the quantity and timing of carbonate-producing animals, etc.) In all, they argue that the existence of life on earth is responsible for a probability factor of "1/10^42" against the existence of life on earth! Yeah, the existence of life on earth sure does make for a strong case that life on earth is unlikely!

They also mix in a bunch of earth-specific factors ("position and mass of Jupiter relative to earth", "amount of outward migration of Neptune", etc.); but they claim to be computing the probability of life existing around any potential life-supporting body in the universe. Yep, the odds of any other earth-like planet having Jupiter as a neighbor are pretty slim, all right!

They go on, and on, and on like this - with barely a single valid number in the entire exercise, to conclude:

Thus, less than 1 chance in 10^282(million trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion trillion) exists that even one such life-support body would occur anywhere in the universe without invoking divine miracles.

Sunday, May 21, 2006

Magic 23

Since I posted a couple of items about gematria, and the kinds of gibberish that it can be used to create, I've been receiving tons of emails with people pointing me at silly gematria sites on the net. I had no idea this gibberish was so widespread!

In general, it's all the same kind of thing: stream-of-consciousness rambling, using numbers to guide the streams. It's very funny in its way, but it gets dull pretty quickly: read one of 'em, and you might as well have read them all.

But once in a while, you find a prize: something that's different, special in its own uniquely ridiculous way. And an anonymous person sent me a link to one of those. This is really fantastically goofy, and there's a whole group of people who are actively obsessed with it. It's the "235" gang. They believe that there are all sorts of magical connections involving the numbers "2" and "3"; some also include "5", but others claim that 5 isn't special on its own, because it's just the sum of 2 and 3. A couple of their writeups are:
The first one is just the usual kind of gematriacal rambling, but it's got the most exhaustive list of the 235s that I could find. The second one is better; it's just chock-full of crunchy insanity.

Here's how it starts out:
For many years, we have been told by Egyptologists and historians that the pyramids of ancient Egypt were the tombs of the Pharaohs and nothing more.

This is the ‘official line’ and this could indeed be true as regards the majority of the pyramids in Egypt which were clearly built by the ancient Egyptians – many of which collapsed to rubble soon after being built.

But ask the many researchers (those not funded by institutions) who have made a life time study of the pyramids, and they would most likely tell you that the Great Pyramid of Giza is unique – not just because it is a perfect pyramid which has withstood the ravages of time – but because it appears that one of its purposes was to preserve knowledge.

They will also tell you that much meaningful information has already been derived from the Great Pyramid, even though many scholars and academics – i.e., those funded by the institutions and establishments – have never really accepted this data and continue to ignore it.

During the last months of 2002, I made a new major discovery about the Great Pyramid of Giza: that the Polar or Celestial Axis of the Earth, which is presently tilted at 23.43 degrees, (given the round figure of 23.5º in most text books on the subject) is clearly referenced in the geometrical structure of the Pyramid.
Such a bundle of silliness, just in the first 5 paragraphs!
  • The conspiracy! All of those people "funded by institutions" are trying to cover up the truth!
  • The mounds of "data" that everyone ignores!
  • The lone genius making spectacular discoveries all by himself!
  • The "rounding" of the angle of the earth's axis and the position of the Giza pyramid to get the magical "5" in, even though the Egyptians did not use decimal angle measurements! (To the best of our knowledge, the Egyptians used mathematical units derived from the Babylonians, which is where the "degree" unit comes from; the Babylonians used a sort of base-60 math - thus the 360 degrees, 60 minutes, 60 seconds angle measurements.)
Ah, but it gets much better. Because, you see, the conspiracy is trying to cover up the true history of the earth! But initiates of the great mysteries throughout history have been trying to provide hints! In the pyramids, in paintings, in maps, in the layouts of cities!
We also find that the Great Pyramid is actually pointing to its own location on the earth – being almost 30º N from the Equator, and this is the decisive factor for me as regards the credibility of this encoded information.
See, the pyramid is pointing out its own location, because it's got this angle of 23.5 degrees, which is the three magic numbers 2, 3, and 5; and it's 30 degrees north, which is not just 10 times the magic number of three, but it's also the angles of the "perfect" triangles that make up the sides of the pyramid!
This new discovery about the Great Pyramid is remarkable in itself; but how I actually uncovered this hidden information never fails to amaze me – even now as I reflect on it – as it involved a 17th century painter who on the surface appears to have had nothing to do with Egypt and the pyramids of Giza . . . an artist whose works of art are believed to contain a code which many researchers have endeavored for years to decipher with no real success.

It would surprise many to learn that I first discovered this information about the Great Pyramid through studying and deciphering the hidden codes in three paintings by the French artist Nicolas Poussin (1594–1665). These initial discoveries and my own interpretation of the data will be revealed for the first time in my book Axis of God.
And of course, we can't have all this yummy insanity without a sales pitch! See, he discovered this secret about the great pyramid because of the paintings of a French artist; and all you need to do to see how he discovered it is to buy his book!
However, the truth is, that the paintings or sources I have analyzed are all associated by theme, and also contain deliberately planned, conspicuous linear items and features that are just “begging to be measured” so as to obtain the angles being presented.
Hmmm... He sees "deliberately planned, conspicuous linear items that are begging to be measured" in all sorts of paintings? I wonder what Freud would think of that?
For the sake of balancing the argument and to try to prove myself wrong, I also looked into the fact that the angle of 22.5 degrees is also a common angle used in art for perspective, as 22.5º is one sixteenth of a 360º circle. I mention this so that others would be aware of this and could perhaps use this in their argument and so test this themselves if interested.
I must point out that I have examined many paintings, and have found that although some of the overall ‘perspective features’ in some paintings are in the ‘right ball-park’ and could therefore be associated with the 22.5º angle of perspective so as to support this given fact, I have found that many suspect features, such as staffs, poles, flagpoles, spears, lances, clubs, bones, limbs, swords and other linear objects, are definitely at an angle of 23.5º or even 23 degrees.
So, by carefully examining paintings, he finds that the linear items begging to be measured are all quite definitely at an angle of 23 or 23.5 degrees - and in a painting, he can clearly distinguish 22.5, 23, and 23.5 degrees. Hmm... Just for kicks, I decided to see how hard it would be to do that. Here's a little diagram, showing three overlapping angles: 22.5, 23, and 23.5 degrees:

[diagram: three overlapping lines drawn at 22.5, 23, and 23.5 degrees]

Gosh, he's good. He can distinguish between those three angles in a painting! Wow!

But the meaning of the magical 23.5 is deeper than that. After all, what good would a conspiracy be, if there weren't a reason behind it?
But it is at this point that we may ask, ‘why’ is the earth’s inclined axis being referenced in all these sources? After all, some believe, and are of the opinion, that the tilt of the earth’s axis is a ‘natural phenomenon’, and that the 23.5º tilt is necessary as the summer and winter cycles govern life on our planet. But, if this is true, then why do we find references to the angle of 23.5º in paintings on the theme of death, battle, war and conflict?

What exactly did the ‘initiate’ artists know – those who gave reference to this angle? . . What were these people trying to tell us?

My own interpretation, based on a lot more evidence which I have found and have yet to present, is that it was believed by the ‘initiated’ that the tilt of the earth isn’t natural, nor is it an ideal situation. Indeed it is a fact that the early Greek philosophers considered the tilt of the earth to be an unnatural and “irregular condition” and not something that had been set in motion since the beginning of creation.
Yes, he's not just a gematria loony, he's a catastrophist, like the electric universe gang! The reason the magical numbers 235 show up all of the time is that we fell from paradise when the earth's axis tilted from its perfect position to an angle of 23.5 degrees!
The more I researched into these discoveries; more remarkable data began to present itself: I found that the angle of 52 degrees is also referenced in these same sources and is almost as frequent as the 23.5-degree references.
This is interesting because the significant thing about the angle of 52º is that this is the angle of the inclined sides of the Great Pyramid of Giza . . .
See, 52 degrees is also part of the magical numbers of the great pyramid, and if you can find anything, anything at all, that includes something with a measurement that looks like either 23 or 52 degrees, well then, it's obviously a reference to the hidden knowledge of the pyramid by initiates of secret knowledge. But why would they need to hide it?
The Catholic Church was very powerful during the Dark Ages and the Middle Ages – and really until the dawning of Science and the so-called Age of Enlightenment; so to be branded a heretic was very dangerous to one’s health.
Those who knew the truth behind the Church and Christianity would have had to be seen to “toe the line” and so information and knowledge which would have been seen as “heretical” and otherwise suppressed and stamped-out by the Church, would have been encoded – not only to preserve it, but also to secretly pass this knowledge onto others.
Of course! What good is a conspiracy if we can't connect it to the Catholic Church? You see, the magic numbers referred to hidden knowledge that predates Jesus, and so the church is out to get anyone who tries to talk about it. So, all of those initiates go and hide it in plain sight, so that the church won't notice it, but everyone else will. Because, you see, conspiracies that manage to keep secrets from the general population for thousands of years are incapable of noticing symbolism which will be obvious to everyone else.

But you know what this conspiracy theory about the great magic numbers of the pyramid is missing? The Masons and the Illuminati! Every good conspiracy has to include them, doesn't it? Ah, don't you start worrying now, he's got it covered!
We find the angle of 23.5º and 52º repeated in Templar and Masonic symbolism – such as the Templar Cross Patee or Maltese Cross and also the Masonic Compass and Set-square which we will come to later. Most surprisingly we find the angle of 23.5º in Leonardo Da Vinci’s famous drawing, Vitruvian Man.

The angle of 23.5º is even referenced in one of the most infamous symbols of them all – the “All-Seeing Eye” in the detached and floating capstone of the pyramid from which rays of light radiate – which is really a reference to the Great Pyramid of Giza which is missing its capstone. This symbol is on every US Dollar bill and it’s amazing how people could have missed this.
Ok, Masons, check. DaVinci, check.

Aw, shucks. No illuminati. Hmmm... Wait a second, let's see if this author talks about the connection between 23 and the Illuminati somewhere else. Gosh, that other 23 theorist I linked to? He references another book by the author of this pyramid stuff, called "REVELATIONS":
Classic conspiracy authors Robert Shea and Robert Anton Wilson (RAW) account the significance of the number 23 as it appeared with great synchronicity in the weird and multi-dimensional voyage of their 'Illuminatus' books. In fact, there might be something much deeper to this message of 23 than the paranoid induced theories of tin foil hat wearing kooks. As we will see, there appears to be a secret message, a secret knowledge, encoded in the number 23.5, among many other recognized in Kabbalistic teachings and numerology.

Let's look at the Bavarian Order of the Illuminati, founded in 1776 by Jesuit trained Adam Weishaupt. There are many theories due to its founding in 1776 coinciding with the American signing of the Declaration of Independence, that Weishaupt and first President George Washington were in fact the same man. Whether they were indeed one in the same, or used as dopplegangers to each other, there are some very strange connections in regards to the 23 when comparing the life and death of these two individuals.

George Washington
Born: Feb 11th 1732
Death: Dec 14th 1799

Adam Weishaupt
Born: Feb 6th 1748
Death: Nov 18th 1811

Washington and Weishaupt were born 5 days, 16 years apart. They died 26 days, 12 years apart. If you add together 5+1+6+2+6+1+2 it again comes to the 'magic' number of 23. But this is just the beginning in an intricate web of deceptive metaphysics that makes up not only the history of the United States of America, but also takes us into the very core and foundation of the Secret Society and religious monument building.

For those interested with delving further into the relations of the 23 alignments, you can find more details at Gary Osborn's website; in a fascinating excerpt from his latest book entitled REVELATIONS. It sheds light on the possible meanings of this prominent and misunderstood number, detailing overlooked pieces of the puzzle which will be uncovered in his 'Axis of God' release, due in early 2007. Gary is also the co-author of underground best-selling 'The Shining Ones', and 'The Serpent Grail'.
Yep, we got our Illuminati connection! Along with the assertion that this is all sane stuff, not any of the tinfoil hat lunacy!

Not loony enough for your tastes yet? Well... take a look at one of the "supporting references" for the 23 gibberish here. Yes indeedy, 23 is even connected to H. P. Lovecraft and Cthulhu!
H. P. Lovecraft, who first wrote of the EOD, entered the Jane Brown Memorial Hospital where he died 5 days later, on 10/2/1937. The numbers in date of HP Lovecraft's leaving this world add numerologically to 23! 1 + 0 + 2 + 1 + 9 + 3 + 7 = 23.

In the H. P. Lovecraft tale, "The Call of Cthulhu," on March 23rd the crew of the Emma landed on an unknown island and left six men dead; and on that date the dreams of sensitive men assumed a heightened vividness and darkened with dread of a giant monster's malign pursuit, whilst an architect had gone mad and a sculptor had lapsed suddenly into delirium! It was on the 23rd that the sculptor, Wilcox, was struck with fever. It was on the 23rd that the sense impacts from Cthulhu were heard and sensitive people heard the Deep Ones chanting. It is of quite pivotal importance to the story. 23 = the number of Cthulhu's influence.

Human beings have 23 pairs of chromosomes. Anything humans might be able to mate with would also have 23 pairs of chromosomes.

About 100 years ago, Dr. Hermann Swoboda, professor of psychology at the University Of Vienna, became interested in the study of cyclical, internal changes affecting humans. He concluded that there were 2 basic rhythms in humans, a 23 day physical cycle is one of them (the other is a 28 day emotional cycle). Since Cthulhu does not have human emotions, It would only have a physical cycle, even if It is not made of matter we can understand. The physical cycle of Cthulhu is 23. In our 3 dimensions of space as we measure time on Earth, that's what it would be.
I can't take it anymore! I think I need to go have a beer. Wait a second - the beer bottles that I made my homebrew stout in... They're 23 ounce bottles!

Friday, May 19, 2006

Programming SKI

As an interesting followup to the post on combinators: an extremely crazy individual actually implemented a SKI combinator calculus interpreter called Lazy K. And what's even crazier is, he implemented the Sieve of Eratosthenes in Lazy K. Here it is:


K
(SII(S(K(S(S(K(SII(S(S(KS)(S(K(S(KS)))(S(K(S(S(KS)(SS(S(S(KS)K))(KK)))))
(S(S(KS)(S(KK)(S(KS)(S(S(KS)(S(KK)(S(KS)(S(S(KS)(S(KK)(SII)))
(K(SI(KK)))))))(K(S(K(S(S(KS)(S(K(SI))(S(KK)(S(K(S(S(KS)K)(S(S(KS)K)I)
(S(SII)I(S(S(KS)K)I)(S(S(KS)K)))))(SI(K(KI)))))))))(S(KK)K)))))))(K(S(KK)
(S(SI(K(S(S(S(S(SSK(SI(K(KI))))(K(S(S(KS)K)I(S(S(KS)K)(S(S(KS)K)I))
(S(K(S(SI(K(KI)))))K)(KK))))(KK))(S(S(KS)(S(K(SI))(S(KK)(S(K(S(S(KS)K)))
(SI(KK))))))(K(K(KI)))))(S(S(KS)(S(K(SI))(SS(SI)(KK))))(S(KK)
(S(K(S(S(KS)K)))(SI(K(KI)))))))))(K(K(KI))))))))))(K(KI)))))(SI(KK)))))
(S(K(S(K(S(K(S(SI(K(S(K(S(S(KS)K)I))(S(SII)I(S(S(KS)K)I)))))))K))))
(S(S(KS)(S(KK)(SII)))(K(SI(K(KI)))))))(SII(S(K(S(S(KS)(S(K(S(S(SI(KK))
(KI))))(SS(S(S(KS)(S(KK)(S(KS)(S(K(SI))K)))))(KK))))))(S(S(KS)
(S(K(S(KS)))(S(K(S(KK)))(S(S(KS)(S(KK)(SII)))(K(S(S(KS)K)))))))(K(S(S(KS)
(S(K(S(S(SI(KK))(KI))))(S(KK)(S(K(SII(S(K(S(S(KS)(S(K(S(K(S(S(KS)(S(KK)
(S(KS)(S(K(SI))K))))(KK)))))(S(S(KS)(S(KK)(S(K(SI(KK)))(SI(KK)))))
(K(SI(KK))))))))(S(S(KS)(S(K(S(KS)))(S(K(S(KK)))(S(S(KS)(S(KK)(SII)))
(K(SI(K(KI))))))))(K(K(SI(K(KI)))))))))(S(K(SII))(S(K(S(K(SI(K(KI))))))
(S(S(KS)(S(KK)(SI(K(S(K(S(SI(K(KI)))))K)))))(K(S(K(S(SI(KK))))
(S(KK)(SII)))))))))))(K(SI(K(KI))))))))(S(S(KS)K)I)
(SII(S(K(S(K(S(SI(K(KI)))))K))(SII)))))

From Lambda calculus to Combinator Calculus

After yesterday's description of the Y combinator in lambda calculus, I thought it would be fun to show some of the useful stuff that you can do using combinators.

Let's take three simple combinators:
  1. S: S is a function application combinator: S = lambda x y z . (x z (y z)).
  2. K: K generates functions that return a specific constant value: K = lambda x . (lambda y . x).
  3. I: I is the identity function: I = lambda x . x
At first glance, these look like a very strange selection. In particular, S is an odd application mechanism - rather than taking two parameters, x and y, and applying x to y, it takes two functions x and y and a third value z, and applies the result of applying x to z to the result of applying y to z.

But there's a reason. Watch this: each line in the following is one step of reduction:

S K K x =
(K x) (K x) =
x


Poof! We don't need I at all. We just created the equivalent of I using nothing but S and K. But that's just the start: in fact, we can create the equivalent of any lambda calculus expression using nothing but the S and K combinators with no variables.

For example, the Y combinator is:

Y = S S K (S (K (S S (S (S S K)))) K)
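Since S, K, and I are just curried functions, you can play with them directly. Here's a minimal sketch in Python - my own illustration, not part of the original post - that defines the three combinators and checks the S K K reduction above:

I = lambda x: x
K = lambda x: lambda y: x                        # K x y = x
S = lambda x: lambda y: lambda z: x(z)(y(z))     # S x y z = x z (y z)

# S K K behaves like the identity on anything you hand it:
print(S(K)(K)(42))          # 42
print(S(K)(K)("hello"))     # 'hello'
# (Don't try to apply the Y expression this way, though: Python evaluates
#  arguments eagerly, so a call-by-value Y will just recurse forever.)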

Before we go any further, there's one important thing to point out. Note that above, I said that with S K K we created the equivalent of I. It does not reduce to lambda x . x.

So far in lambda calculus, we've said that "x = y" if and only if x and y are either identical, or can be made identical through alpha conversion. (So lambda x y . x + y is equal to lambda a b . a + b, but not equal to lambda x y . y + x.) This is called intensional equivalence. But it's extremely useful to have another idea of equality, which is called extensional equivalence or extensional equality. In extensional equality, an expression X equals an expression Y if/f either X is identical to Y (modulo alphas), or for all a . X a = Y a.

From now on, when we use "=", it will be for extensional equality. We can translate any lambda expression into an extensionally equivalent combinator term. We'll define a transform function C from lambda terms to combinator terms:
  • (1) C{x} = x
  • (2) C{E1 E2} = C{E1} C{E2}
  • (3) C{lambda x . E} = K C{E} if x is not free in E.
  • (4) C{lambda x . x} = I
  • (5) C{lambda x. E1 E2} = (S C{lambda x.E1} C{lambda x.E2})
  • (6) C{lambda x.(lambda y. E)} = C{lambda x. C{lambda y.E}}, if x is free in E.
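Here's a small sketch of those six rules in Python - my own illustration, using a made-up tuple encoding in which a variable or combinator is a string, an application is ('app', f, a), and a lambda is ('lam', x, body) - which you can compare against the walkthrough that follows:

def free_in(x, term):
    # Is the variable x free in term?
    if isinstance(term, str):
        return term == x
    if term[0] == 'app':
        return free_in(x, term[1]) or free_in(x, term[2])
    return term[1] != x and free_in(x, term[2])   # 'lam' case

def C(term):
    if isinstance(term, str):                              # rule 1
        return term
    if term[0] == 'app':                                   # rule 2
        return ('app', C(term[1]), C(term[2]))
    x, body = term[1], term[2]
    if not free_in(x, body):                               # rule 3
        return ('app', 'K', C(body))
    if body == x:                                          # rule 4
        return 'I'
    if body[0] == 'app':                                   # rule 5
        return ('app', ('app', 'S', C(('lam', x, body[1]))), C(('lam', x, body[2])))
    return C(('lam', x, C(body)))                          # rule 6

def show(term):
    # Pretty-print, letting application associate to the left.
    if isinstance(term, str):
        return term
    f, a = show(term[1]), show(term[2])
    return f + ' ' + (a if isinstance(term[2], str) else '(' + a + ')')

flip = ('lam', 'x', ('lam', 'y', ('app', 'y', 'x')))   # lambda x y . y x
print(show(C(flip)))   # S (K (S I)) (S (K K) I), the same result as the walkthrough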
Let's try a walkthrough:
  • C{lambda x y . y x}
  • Curry the function: C{lambda x . (lambda y . y x)}
  • By rule 6: C{lambda x . C{lambda y . y x}}
  • By rule 5: C{lambda x . S C{lambda y. y} C{lambda y . x}}
  • By rule 4: C{lambda x . S I C{lambda y . x}}
  • By rule 3: C{lambda x . S I (K C{x})}
  • By rule 1: C{lambda x . S I (K x)}
  • By rule 5: S C{lambda x . S I} C{lambda x . (K x)}
  • By rule 3: S (K (S I)) C{lambda x . K x}
  • By rule 5: S (K (S I)) (S C{lambda x . K} C{lambda x . x})
  • By rule 4: S (K (S I)) (S C{lambda x . K} I)
  • By rule 3: S (K (S I)) (S (K K) I)
Now, let's try using "x" and "y" as parameters to that combinator expression, and reduce:
  • S (K (S I)) (S (K K) I) x y
  • Let's create some aliases, to make things easier to read: A=(K (S I)), B = (S (K K) I), so our expression is now: S A B x y
  • Expand S: (A x (B x)) y
  • Let's unalias B: (A x ((S (K K) I) x)) y
  • Now, let's get rid of the S: (A x ((K K) x (I x))) y
  • And get rid of the I: (A x ((K K) x x)) y
  • Reduce (K K) x: (A x (K x)) y
  • Expand out the A alias: ((K (S I)) x (K x)) y
  • Reduce (K (S I)) x, giving: ((S I) (K x)) y
  • Reduce S: I y ((K x) y)
  • Reduce I: y ((K x) y)
  • And finally reduce (K x) y, leaving: y x
And there we go. Fun, huh?
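And if you want to check that result mechanically, here's the same experiment using the Python versions of S, K, and I from the earlier sketch (again, just my own illustration): the translated term really does behave like lambda x y . y x.

I = lambda x: x
K = lambda x: lambda y: x
S = lambda x: lambda y: lambda z: x(z)(y(z))

# S (K (S I)) (S (K K) I), the translation of lambda x y . y x:
flip = S(K(S(I)))(S(K(K))(I))

# Applying it to a value and a function applies the function to the value:
print(flip(3)(lambda n: n + 1))      # 4, i.e. (lambda n: n + 1)(3)
print(flip("good math")(str.upper))  # 'GOOD MATH', i.e. str.upper("good math")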