Monday, August 20, 2018

(Very slowly) turning the entire English language into computer code

TL;DR:  I'm very slowly working on defining all of the English language in code.  Here is the link.

So when I was a kid, I thought dictionaries were interesting.  You define a word, and that's defined in terms of other words, which are defined in terms of other words, on and on.  What, I thought to myself, if you could expand a word out and see what it's made of down to the lowest levels?  What is the smallest set of words you need to define other words?  An issue is that dictionaries are often circular.  Life is defined in terms of death, death in terms of life.

Well, I was really happy to discover much later in life that I was not the first to think along these lines.  Enter the Natural Semantic Metalanguage.  This is supposedly a list of around 60 words that any other word can be defined down to.  See the link for the words, words like "do" and "happen" and "part."  Supposedly?  Did I say supposedly?  Turns out the people behind a website called Learn These Words First (LTWF) actually made a non-circular dictionary.  They boiled down the definitions of 2000 words, in layers, down to the 60.  They picked those 2000 words because an English learner's dictionary defines all of its entries using only those 2000 words, so boiling them down should in principle cover the rest of that dictionary too.
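To make the "boil it down in layers" idea concrete, here is a minimal Python sketch of expanding a word down to its primitives, with a check for circular definitions.  The tiny dictionary and the primitive set are invented for illustration; they are not real NSM or LTWF definitions, and `expand` is just a hypothetical helper name.

```python
# Toy data, invented for illustration: not real NSM or LTWF definitions.
PRIMITIVES = {"do", "happen", "part", "something", "live"}

DEFINITIONS = {
    "body": ["part", "something"],
    "animal": ["something", "live", "body"],
    "dog": ["animal", "do"],
}


def expand(word, definitions, primitives, _path=()):
    """Return the set of primitives that `word` ultimately reduces to.

    Raises ValueError if a definition is circular or a word is undefined.
    """
    if word in primitives:
        return {word}
    if word in _path:
        # We came back to a word we are already in the middle of expanding.
        cycle = " -> ".join(_path + (word,))
        raise ValueError(f"circular definition: {cycle}")
    if word not in definitions:
        raise ValueError(f"no definition for: {word}")
    result = set()
    for part in definitions[word]:
        result |= expand(part, definitions, primitives, _path + (word,))
    return result


print(sorted(expand("dog", DEFINITIONS, PRIMITIVES)))
# prints ['do', 'live', 'part', 'something']
```

With a circular pair like life and death, each defined only in terms of the other, the same function raises an error instead of looping forever, which is exactly the problem a non-circular dictionary avoids.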

This is all super great.  However, I thought to myself, English is such an imprecise language.  What if I could make the definitions look more well defined?  I decided to define them like one would define computer code, very specifically.  I am boiling down the definitions to something similar to a symbolic logic system called predicate logic where one might say something like "all dogs have tails" as "ForAll(x, dog(x) Implies Exists(y, tail(y) And has(x, y)))."  Basically you are making statements about what kinds of things exist, what kinds of things all things have, what things imply other things, things like that.  It's a language with a pretty small number of words in the end, if you can boil down further words like "dog" and "tail" and "has."
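As a rough illustration of what a statement like that means, here is a small Python sketch that checks "all dogs have tails" against a tiny made-up world.  This is not the code from my project (that is in Mathematica); the names, the world, and the helper functions are all assumptions chosen to keep the example short.

```python
# A tiny made-up world: every name below is hypothetical.
DOMAIN = ["rex", "fido", "rex_tail", "fido_tail", "rock"]
DOGS = {"rex", "fido"}
TAILS = {"rex_tail", "fido_tail"}
HAS = {("rex", "rex_tail"), ("fido", "fido_tail")}


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def for_all(domain, pred):
    return all(pred(x) for x in domain)


def exists(domain, pred):
    return any(pred(x) for x in domain)


def all_dogs_have_tails(domain, dogs, tails, has_pairs):
    """ForAll(x, dog(x) Implies Exists(y, tail(y) And has(x, y)))"""
    return for_all(
        domain,
        lambda x: implies(
            x in dogs,
            exists(domain, lambda y: y in tails and (x, y) in has_pairs),
        ),
    )


print(all_dogs_have_tails(DOMAIN, DOGS, TAILS, HAS))
# prints True

# Take away fido's tail and the statement no longer holds.
print(all_dogs_have_tails(DOMAIN, DOGS, TAILS, {("rex", "rex_tail")}))
# prints False
```

Note that the rock does not break the statement: it is not a dog, so the "Implies" part is satisfied for it automatically.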

So bad news first: I'm nowhere near finishing the project, and it needs work even as it is.  Good news: I've started it, and I feel like sharing it because I just want people to know I'm working on it.  I'm loosely following the LTWF non-circular dictionary mentioned above (it has been super helpful).

The code doesn't really do anything right now besides definitions, but I think the definitions are fun to look at.  At some point I do want to make it so you can expand definitions out by clicking or something, or perhaps even make it so that code you write like on(house, hill) would actually draw a generic house on the screen or something.  Some day I might try swapping English with an already more well defined language like Lojban.

I've posted it on GitHub if anyone wants to make suggestions.  Here is a link where you can easily view the code.  I wrote it in Mathematica.  I might at some point convert it to Python or Haskell or even maybe Prolog if that makes things easier for people.  You can use Mathematica online for free now.  Also, someone's working on a free Python interpreter for some of the language, and someone else is working on a Haskell version.
