Jonathan Stoler | How much Javascript can we write without using English?

Like all the best things in life, this post is inspired by a silly joke:

Crimes,

@agafnd@www.librepunk.club

what programming language did web browsers use a long time ago, in a galaxy far, far away

Jabbascript

This got me thinking: how much of the Javascript language can we bootstrap in Javascript without writing any English? Or, taken to the extreme, how far can we get in Javascript without typing a single letter? That means no keywords, no identifiers, no alphabetical characters in strings, etc. As it turns out, we don't need numbers either.

My original plan was to create a "dialect" of Javascript where all the keywords were redefined in Huttese, to satisfy the joke... but it turns out Huttese doesn't have words for things like "function" or "array." As a compromise, I'm using English identifiers for all the language constructs. To make it clear, all my identifiers will start with $ and be ALL_CAPS. Temporary variables will also start with $ but will be lower_case.

You can imagine everything being named like $_ or $__ to satisfy the rules, but that would be impossible to decipher, so we're going with readable names. Just remember, the dollar sign means it's just a variable binding and could be exchanged for an entirely different name without breaking anything.

Step 0: Building Blocks

Where do we even begin with this? What can we create with no letters or numbers? Actually, quite a bit.

// === Things we can create in JS without letters or numbers ===

// == Strings ==
// empty string (VERY useful)
""
// string with non-alphanumeric characters (not useful)
"[]{}()$!_"

// == Regular Expressions ==
// regular expression with non-alphanumeric characters (not useful)
/_/

// == Arrays ==
// empty array
[]
// array of other items listed here (not useful)
[[], {}]

// == Objects ==
// empty object
{}
// object with non-alphanumeric keys (not useful)
{["[]"]: []}

// == Functions ==
// anonymous arrow functions (VERY useful)
() => {}
// we can even use parameters if we want to (not useful)
($_, $__) => { $_ + $__ }

// == Operations ==
// (most of these are unused, or equivalent for our needs)
// +, -, *, /, %, **
// >>, <<
// [] indexing
// !, |, &, ^, ~
// !=, ==, ===
// >, <, >=, <=
// = assignment (no declaration keyword, so in sloppy mode this creates a global; strict mode throws instead)

Step 1: Booleans

The easiest things to construct from nothing are the booleans. Just use an equality check.

$FALSE = [] === []; // => false
$TRUE  = !$FALSE;  // => true

Interestingly, we start with false here because [] !== [] and {} !== {}. Objects, which include arrays, regexps, and functions - the only values besides strings we have access to - are compared by reference, not by value.

We could create true directly using something like $tmp = {}; $tmp == $tmp but why use an assignment if we don't have to? (See note about global scope above!)
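As a quick sketch of why the first check comes out false, here are the two comparisons side by side, plus a shorter spelling that relies on every object being truthy. The names $same, $SHORT_FALSE and $SHORT_TRUE are made up for this example, and const is only there so the snippet also runs in strict mode.

```javascript
// Two array literals are two distinct objects, so strict equality fails.
const $FALSE = [] === [];
const $TRUE = !$FALSE;

// Comparing one object with itself succeeds, at the cost of a temporary.
const $tmp = {};
const $same = $tmp === $tmp;

// Hypothetical shorter spellings: every object is truthy, so negating one gives false.
const $SHORT_FALSE = ![];
const $SHORT_TRUE = !![];
```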

Step 2: Undefined

Easy: just access a property that doesn't exist, like the property "true" on true.

$UNDEFINED = $TRUE[$TRUE]; // => undefined
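There are other letter-free ways to land on undefined, in case the one above feels too cute. This is only a sketch: $FROM_ARRAY and $FROM_CALL are names invented here, and const is used just so it runs in strict mode too.

```javascript
const $TRUE = !([] === []);

// Property "true" of a boolean does not exist.
const $UNDEFINED = $TRUE[$TRUE];

// An empty array used as a key coerces to "", and no array has a "" property.
const $FROM_ARRAY = [][[]];

// A function with an empty body returns undefined.
const $FROM_CALL = (() => {})();
```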

Step 3: Numbers

We can convert our booleans into numbers - false will coerce into 0 and true will coerce into 1. We can force this coercion using a unary operator.

$ZERO = +$FALSE; // => 0
$ONE  = +$TRUE;  // => 1

From there, we can create any number we want:

$TWO = $ONE + $ONE;
$THREE = $TWO + $ONE;
// etc

The highest number we need to use for the rest of this procedure is 6.
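Counting up one at a time isn't the only option. As a sketch, the other letter-free operators from Step 0 get to bigger (or negative) numbers faster. $SIX, $EIGHT, $TEN and $MINUS_ONE are hypothetical names, and const is used only so the snippet runs in strict mode.

```javascript
// Bootstrap as in Steps 1 and 3.
const $FALSE = [] === [];
const $TRUE = !$FALSE;
const $ZERO = +$FALSE;
const $ONE = +$TRUE;
const $TWO = $ONE + $ONE;
const $THREE = $TWO + $ONE;

// Hypothetical extras built with multiplication, exponentiation, shifts and bitwise NOT.
const $SIX = $TWO * $THREE;
const $EIGHT = $TWO ** $THREE;
const $TEN = ($ONE << $THREE) + $TWO;
const $MINUS_ONE = ~$ZERO;
```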

We can also create NaN and Infinity, though we don't actually need either.

$INFINITY = $ONE / $ZERO;  // => Infinity
$NAN      = $ZERO / $ZERO; // => NaN

Note: from now on, I will be using number literals in values and strings. I've proven I can create any number, so this is done for readability purposes.

Step 4: Null

null is the last primitive value we need to obtain, and it's also the hardest. We can't use type coercion to get null, and every built-in function that returns it has to be reached through a name spelled with letters. In practice, that means we have to create our own function that returns null.

The closest we can get is to use a regular expression: /_/.exec() can return null - but that requires us to write out "exec" which goes against our rules. So we have to take a different approach. If we can generate the string "null", we can run JSON.parse("null") to get a null primitive back.

Step 4.1: JSON from where?

Now we have a strategy to generate null, but we can't just write out JSON.parse since that uses letters. So we need a way to construct and execute the string JSON.parse. Boy does that have a lot of capital letters! That's going to be hard to generate from nothing, but let's just proceed under the assumption that it's possible.

If we are able to generate "JSON" and "parse" we can use indexing to access globalThis["JSON"]["parse"].

But we're going to run into the same problem with globalThis! How do we access the value of a global variable without typing out its name?
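For reference, here are the two routes to null mentioned so far, written with ordinary letters. This is only a sketch of what we're trying to reach, not something the rules allow yet; viaExec and viaJson are names invented here.

```javascript
// exec() with no argument matches against the string "undefined",
// which contains no underscore, so the result is null.
const viaExec = /_/.exec();

// Globals are properties of globalThis, so strings plus indexing would be enough
// if only we could get hold of globalThis itself.
const viaJson = globalThis["JSON"]["parse"]("null");
```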

Step 4.2: Eval

Ah, XSS hackers rejoice! We finally found a use for eval. We can use eval to execute arbitrary code passed in as a string, which would let us run eval("globalThis") or eval("JSON.parse").

There are two problems with this approach. First, if we have access to eval, we can just use eval("null") which is much shorter and less capitalized than globalThis or JSON.parse. Second, and more importantly, we're still confronted with the same chicken-and-egg problem as before. eval is basically a global binding, so even if we are able to generate the string "eval", we have no way to actually access the bound function without writing letters. eval("eval") violates our rules.

Luckily there's another way to run arbitrary code.

Step 4.3: Function construction

In Javascript, functions are objects.

function myFunction() {
    return true;
}
myFunction.property = "value";

Further still, functions are objects of type Function. Below, all are equivalent.

function f() { return true; }

f = function() { return true; }

f = () => true

f = new Function("return true")

(There are some important differences in variable scope handling depending on how you create your function... but our particular example doesn't use any variables, so they are equivalent.)

And look at that last example! We just executed arbitrary code passed in as a string! That's exactly what we want to do to get a null primitive back.

Two potential problems: both new and Function use letters, and we run into the same chicken-and-egg problem as before.

However, new is optional in this situation... so we just have to find a way to access the Function value without actually writing it. We have access to arrow functions, so that seems like a good place to start. Sure enough, we can get the Function value back from (() => {}).constructor. Since we can't write out the constructor property to access it, we'll have to use a string with indexing like (() => {})["constructor"].
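Here's a small sketch of those claims, still written with letters for readability. It checks that an arrow function's constructor really is Function, that calling it without new works, and it shows the scope difference mentioned earlier. viaArrow, alwaysTrue, probe and hiddenLocal are names made up for this example.

```javascript
// An arrow function inherits "constructor" from Function.prototype.
const viaArrow = (() => {})["constructor"];
const sameAsFunction = viaArrow === Function;

// Calling it without new still builds a function from source text.
const alwaysTrue = viaArrow("return true");

// The scope difference: a function built from a string only sees globals,
// not the locals of the place that built it.
const probe = () => {
    const hiddenLocal = 1;
    return viaArrow("return typeof hiddenLocal")();
};
```

Here probe() reports "undefined" for hiddenLocal, which is fine for us: the code we want to run only ever needs globals.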

So, next step: generate the string "constructor".

Step 4.4: Let's coerce a bunch of strings!

We can coerce all the primitives we currently have access to into strings, then pull out their individual characters:

"" + $TRUE      // => "true"
"" + $FALSE     // => "false"
"" + {}         // => "[object Object]"
"" + $UNDEFINED // => "undefined"

// unused, but available
"" + $NAN      // => "NaN"
"" + $INFINITY // => "Infinity"

This gives us the following alphabet: INO[]abcdefijlnorstuy. That has everything we need to generate the string "constructor":

$tmp_str_constructor = ("" + {})[5] + ("" + {})[1] + ("" + $UNDEFINED)[1] + ("" + $FALSE)[3] + ("" + $TRUE)[0] + ("" + $TRUE)[1] + ("" + $UNDEFINED)[0] + ("" + {})[5] + ("" + $TRUE)[0] + ("" + {})[1] + ("" + $TRUE)[1];
// => "constructor"

$FUNCTION = (() => {})[$tmp_str_constructor]; // => [Function: Function]
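The alphabet can be double-checked mechanically. The sketch below uses ordinary Javascript (letters and all) purely as a verification tool; spell and rebuild are hypothetical helpers, not part of the letter-free code. Note that the sorted alphabet also contains the space from "[object Object]".

```javascript
// Strings we can obtain by coercion, in the order listed above.
const sources = [true, false, {}, undefined, NaN, Infinity].map((value) => "" + value);

// Every distinct character available, sorted by code unit.
const alphabet = [...new Set(sources.join(""))].sort().join("");

// Hypothetical helper: for each character, find [source index, character index].
const spell = (word) => [...word].map((char) => {
    const sourceIndex = sources.findIndex((source) => source.includes(char));
    if (sourceIndex === -1) throw new Error("not in the alphabet: " + char);
    return [sourceIndex, sources[sourceIndex].indexOf(char)];
});

// Hypothetical helper: turn a recipe from spell() back into a string.
const rebuild = (recipe) =>
    recipe.map(([sourceIndex, charIndex]) => sources[sourceIndex][charIndex]).join("");
```

For example, spell("constructor") starts with [2, 5], which is the ("" + {})[5] from the expression above, while spell("JSON") throws because J and S aren't available yet.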

Step 4.5: null at last!

Finally, we can generate null.

$tmp_str_return = ("" + $TRUE)[1] + ("" + $TRUE)[3] + ("" + $TRUE)[0] + ("" + $TRUE)[2] + ("" + $TRUE)[1] + ("" + $UNDEFINED)[1]; // => "return"
$tmp_str_null   = ("" + $UNDEFINED)[1] + ("" + $TRUE)[2] + ("" + $FALSE)[2] + ("" + $FALSE)[2]; // => "null"

$NULL = $FUNCTION($tmp_str_return + " " + $tmp_str_null)(); // => null

Step 5: End?

At this point, we can run anything we can encode into a string... including other strings. Using Unicode escaping, we can do the following:

$tmp_str_escape = "\\" + ("" + $TRUE)[2]; // => "\u"
$letter_a = $FUNCTION($tmp_str_return + " '" + $tmp_str_escape + "0061'")(); // => "a"
$letter_b = $FUNCTION($tmp_str_return + " '" + $tmp_str_escape + "0062'")(); // => "b"
// etc

Now we can build a complete alphabet, generate any string, and access any global using string construction and $FUNCTION. So we can run any normal Javascript code, we just have to convert it to a string first. We're done, theoretically.
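To make that concrete, here's a sketch of the whole pipeline: build characters from escapes, join them into a name, then ask $FUNCTION for the global with that name. charFromEscape is a hypothetical helper, and it cheats by using toString and padStart for brevity; the letter-free version would spell the hex digits out by hand as shown above.

```javascript
const $FUNCTION = (() => {})["constructor"];

// Hypothetical helper: evaluate a "\uXXXX" escape to get a one-character string.
// Only handles code points below 0x10000.
const charFromEscape = (codePoint) =>
    $FUNCTION("return '\\u" + codePoint.toString(16).padStart(4, "0") + "'")();

// Spell a global's name from code points (J, S, O, N)...
const word = [0x4a, 0x53, 0x4f, 0x4e].map(charFromEscape).join("");

// ...then fetch the global itself by returning it from a generated function.
const $JSON = $FUNCTION("return " + word)();
const $NULL = $JSON["parse"]("null");
```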

But that's not really the point. Nobody wants to write Javascript using strings... and besides, the origin of this expedition was trying to translate Javascript into Huttese. If we're just running Javascript strings, our keywords will still be in English. Not good enough. Let's keep going.

Step 6: Conditionals

Note: from this point on, I will be using alphanumeric strings. I've shown that I can construct any arbitrary character, and it will be easier for both of us if I could type "abc" instead of $letter_a + $letter_b + $letter_c every time.

It's pretty easy to build a replacement for if out of the ternary operator, using only symbols:

$IF = ($condition, $body) => { ($condition ? $body : (() => {}))(); }

That's the basic idea, but it has a few issues. First, $condition arrives as an already-computed value, so it is always evaluated before $IF gets a say; there's no short-circuiting. Second, it doesn't get us anything like else. Let's touch it up a little:

$IF = ($condition, $body) => $condition() ? ($body(), {
            ["$ELSE"]: () => {},
            ["$ELSEIF"]: ($condition, $body) => $IF(() => $TRUE, () => {})
        })
    : {
        ["$ELSE"]: ($body) => { $body() },
        ["$ELSEIF"]: ($condition, $body) => $IF($condition, $body),
    }

Before we get into how this actually works, I have to comment on this weird syntax. We still can't use the keyword return (in fact, we'll never be able to!) so we have to use arrow functions' implicit return. Arrow functions only implicitly return when their body is a single expression rather than a block in braces.

Technically the body of $IF is just a single ternary expression anyway, but each branch becomes a little complicated. To run $body() before returning the ? branch, we use our good ol' friend the comma operator. Actually, this is how we have to write any function that does some work and then returns a value.
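A tiny sketch of the difference, with made-up names ($lost, $sum, $calls). let and const are used only so it runs in strict mode.

```javascript
let $calls = 0;

// Braces and no return: the sum is computed, then thrown away.
const $lost = ($a, $b) => { $a + $b };

// Parenthesised comma expression: every operand runs left to right,
// and the value of the last one becomes the return value.
const $sum = ($a, $b) => ($calls++, $a + $b);
```

So $lost(1, 2) gives undefined, while $sum(1, 2) bumps $calls and gives 3.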

Here's how our $IF function actually works. Consider the following:

$IF(() => $condition_1, () => "option 1")
.$ELSEIF(() => $condition_2, () => "option 2")
.$ELSEIF(() => $condition_3, () => "option 3")
.$ELSE(() => "option 4")

If $condition_1 is false, we don't want to execute option 1. That's why we see no $body() call in the : piece of our $IF ternary. Then, we return an $ELSE that automatically runs its body and doesn't return anything. This ensures $ELSE is the last statement in the conditional chain. We also return an $ELSEIF which repeats the same $IF logic over again and returns the result. If $condition_2 is false, we will fall through to the second $ELSEIF. If $condition_3 is false, we will hit the $ELSE and run option 4.

Alternatively, if $condition_1 is true, we will run $body() then return an $ELSE which does nothing, and an $ELSEIF which executes another $IF that is always true and with an empty body. This ensures we return the proper {$ELSE, $ELSEIF} object so the method chaining doesn't break, but it doesn't actually run the conditions or bodies of the subsequent $ELSE and $ELSEIF.

Another way to look at it: we keep running conditions until one is true, then we run that $body() and start returning empty bodies down the chain until we reach the end. If all conditions leading up to an $ELSE are false, then we run the body from the $ELSE and we're done.

This only runs the conditions as needed, so something like the following won't cause any TypeErrors:

thing = undefined;
$IF(() => thing === undefined, () => "...")
.$ELSEIF(() => thing.property === true, () => "...")
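To see the laziness in action, here's a sketch that records which conditions and bodies actually run. $IF is copied from above (with const added so it runs in strict mode); trace, check and run are hypothetical helpers for this example only.

```javascript
const $TRUE = !([] === []);

// $IF as defined above.
const $IF = ($condition, $body) => $condition() ? ($body(), {
        ["$ELSE"]: () => {},
        ["$ELSEIF"]: ($condition, $body) => $IF(() => $TRUE, () => {}),
    })
    : {
        ["$ELSE"]: ($body) => { $body() },
        ["$ELSEIF"]: ($condition, $body) => $IF($condition, $body),
    };

// Record every condition that gets tested and every body that gets run.
const trace = [];
const check = (label, value) => () => (trace.push("test " + label), value);
const run = (label) => () => trace.push("run " + label);

$IF(check("1", false), run("1"))
.$ELSEIF(check("2", true), run("2"))
.$ELSEIF(check("3", true), run("3"))
.$ELSE(run("4"));

// trace is now ["test 1", "test 2", "run 2"]
```

Condition 3 is never tested, even though it would have been true, because the chain was already satisfied by condition 2.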

Step 7: typeof

typeof is pretty easy if we have conditionals. We just need to go through all the possible return values of the native typeof:

$TYPEOF = ($obj) => (
    $type = "",
    $IF(() => $obj === $UNDEFINED, () => $type = "undefined")
    .$ELSEIF(() => $obj === $TRUE || $obj === $FALSE, () => $type = "boolean")
    // null has no properties, so catch it before reading ["constructor"]
    .$ELSEIF(() => $obj === $NULL, () => $type = "object")
    .$ELSEIF(() => $obj["constructor"] === $FUNCTION("return Number")(), () => $type = "number")
    .$ELSEIF(() => $obj["constructor"] === $FUNCTION("return BigInt")(), () => $type = "bigint")
    .$ELSEIF(() => $obj["constructor"] === $FUNCTION("return String")(), () => $type = "string")
    .$ELSEIF(() => $obj["constructor"] === $FUNCTION("return Symbol")(), () => $type = "symbol")
    .$ELSEIF(() => $obj["constructor"] === $FUNCTION, () => $type = "function")
    .$ELSE(() => $type = "object"),
    $type
)

The comma operator makes a comeback so we can return our result without using return explicitly.

Step 8: Loops

We can redefine while using recursion:

$SYMBOL = $FUNCTION("return Symbol")();
$BREAK = $SYMBOL("break");
$CONTINUE = $SYMBOL("continue");

$WHILE = ($condition, $body) => {
    $recurse = () => {
        $IF($condition, () => {
            $result = $body();
            $IF(() => $result !== $BREAK, () => {
                $recurse();
            });
        });
    };

    $recurse();
};

That looks like this in use:

// super contrived example
$test = () => Math.random();
$counter = 0;
$WHILE(() => $test() < 0.5, () => $counter++);

$WHILE(() => $TRUE,
    () => (
        $counter++,
        $counter >= 1000 && $BREAK
    )
);

$counter // => probably 1000, if not, go buy a lottery ticket

Then we can implement do:

$DO = ($body) => (
    $body(),
    {["$WHILE"]: ($condition) => $WHILE($condition, $body)}
);

This basically runs $body() once, then returns a chainable $WHILE loop for the rest of the iterations.
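Here's a sketch of $DO in use. It repeats $IF, $WHILE and $DO so it can run on its own, with one deliberate change: $recurse and $result are passed as extra parameters of $WHILE (an assumption of this sketch, not how the code above does it) so they stay local and the snippet works in strict mode. $runs and $n are made-up counters.

```javascript
const $FALSE = [] === [];
const $TRUE = !$FALSE;
const $BREAK = Symbol("break");

// $IF exactly as in Step 6.
const $IF = ($condition, $body) => $condition() ? ($body(), {
        ["$ELSE"]: () => {},
        ["$ELSEIF"]: ($condition, $body) => $IF(() => $TRUE, () => {}),
    })
    : {
        ["$ELSE"]: ($body) => { $body() },
        ["$ELSEIF"]: ($condition, $body) => $IF($condition, $body),
    };

// $WHILE as above, except $recurse and $result are parameters, so they are local.
const $WHILE = ($condition, $body, $recurse, $result) => {
    $recurse = () => {
        $IF($condition, () => {
            $result = $body();
            $IF(() => $result !== $BREAK, () => {
                $recurse();
            });
        });
    };

    $recurse();
};

const $DO = ($body) => (
    $body(),
    {["$WHILE"]: ($condition) => $WHILE($condition, $body)}
);

// The body runs once even though the condition is false from the start.
let $runs = 0;
$DO(() => $runs++).$WHILE(() => $FALSE);

// The body runs once up front, then four more times inside $WHILE.
let $n = 0;
$DO(() => $n++).$WHILE(() => $n < 5);
```

Unused trailing parameters double as local variables here, which is about as close as we can get to let without typing it.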

We can also implement for:

$FOR = ($start, $end, $delta, $body, $recurse) => {
    // Three-argument form: $FOR($start, $end, $body) uses a delta of 1.
    $IF(() => $TYPEOF($body) === "undefined" && $TYPEOF($delta) === "function", () => {
        $body = $delta;
        $delta = 1;
    });

    // $recurse and $counter are parameters, so they stay local to this call
    // and can't clobber a global $counter or the $recurse used by $WHILE.
    $recurse = ($counter) => {
        $IF(() => $delta >= 0 ? $counter < $end : $counter > $end, () => {
            $result = $body($counter);
            $IF(() => $result !== $BREAK, () => {
                $recurse($counter + $delta);
            });
        });
    };
    $recurse($start);
}

This looks a bit complicated, but a lot of the code is there to manage a variable number of arguments, negative deltas, etc. Usage is pretty simple:

$counter = 0;
$FOR(0, 10, 1, () => $counter++);
$FOR(0, 10, () => $counter++);
$FOR(10, 0, -1, () => $counter++);

$counter // => 30

There are some limitations. Since we're using recursion, we will eventually exceed the call stack if we run too many loop iterations.
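A sketch of that limit, using a stripped-down recursive loop in the spirit of $WHILE rather than the real thing. $LOOP, $small, $big and $overflowed are hypothetical names, and the try / catch is ordinary Javascript used only to observe the failure.

```javascript
// Hypothetical minimal recursive loop. The trailing ", undefined" keeps the
// recursive call out of tail position, so engines that eliminate tail calls
// behave like the ones that don't.
const $LOOP = ($condition, $body) =>
    $condition() ? ($body(), $LOOP($condition, $body), undefined) : undefined;

// A short loop is fine.
let $small = 0;
$LOOP(() => $small < 100, () => $small++);

// A long one needs one stack frame per iteration and runs out of stack.
// In V8-based runtimes such as Node, this surfaces as a RangeError.
let $big = 0;
let $overflowed = false;
try {
    $LOOP(() => $big < 1000000, () => $big++);
} catch (error) {
    $overflowed = true;
}
```

How many iterations fit before the overflow depends on the engine and its stack size, so I'm not quoting a number.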

What's missing?

Here are things I have not been able to successfully implement in Javascript without using any letters or numbers. Remember, we can run any arbitrary Javascript with Function() or eval but we don't want to have to pass around our code as one big ugly string. Technically, none of these missing pieces are "required" but it would be nice to have some of them.

async / await - we could create async functions using Function() or eval() but I couldn't find a way to make await work, and we can't polyfill it.
return - we can un-aesthetically work around this using arrow functions and the comma operator.
throw / try / catch - I guess we can just return errors (see above) and still handle them? No stack trace though.
let / const - not required but it would be really nice!

Source Code

Here's the source code for this post, including a complete test suite. Tested in node 17.9.0 and Firefox 99.0.1, though it should work in any spec-compliant environment.

jabbascript.js