Not all ideas are my own. Some are more wild than others. The wildest ones are probably mine.
(This is mostly a way for me to document my own thinking.)
Classification | What it does | Corresponding construct |
---|---|---|
Noun | Names | Cell |
Verb | States action | Function |
Adjective | Describes/limits noun | Protocol |
With its simple syntax and semantics, my hope is that this language could be a powerful tool for teaching programming to kids worldwide, true to the spirit of Smalltalk.
It should even be possible to translate it into natural languages other than English, enabling people to code in their native language!
As Alan Kay points out, a pure object-oriented message-based programming language like Smalltalk resembles the Internet. While Smalltalk was a language for the personal computer, this should be a language for the global computer. I believe that such a language has to be in tune with its environment: a language of messages between autonomous entities.
If the Internet is a global machine, then the web runs on its application layer.
The lingua franca of the web is JavaScript, a language with many faults but also some interesting qualities. It has a lightweight object-oriented approach inspired by Self (prototype-based), coupled with functional programming features inspired by Scheme (immutable primitives and first-class functions). Its runtime environment is single-threaded, yet its event loop enables fast (enough) non-blocking concurrency, while being extremely error tolerant. The V8 runtime even has ties to Smalltalk: its assembler was based on the Strongtalk assembler. Smalltalk/Self and JavaScript have quite a few things in common.
In other words, this has to be a compile-to-JavaScript language (at least until WebAssembly reaches maturity).
Fun fact: Kay's presentation was done in an emulation of a Smalltalk system from the 70s running on JavaScript. JavaScript does have many "good parts", hidden beneath layers of Java-like syntax, quirks and inconsistencies. But those good parts are hard to get to, especially for beginners. Thought experiment: Imagine if Java hadn't happened, IBM had continued backing Smalltalk and Netscape had chosen Smalltalk or Self in 1995, to eventually become the lingua franca of the web. This project wouldn't have been necessary.
Most object-oriented languages are highly complex, while pure OOP languages like Smalltalk and Self have almost Lisp-like qualities. This language aims to bring back the simplicity of the early Smalltalks, and goes even further by leaving out classical inheritance (apologies to Nygaard). One could say it's closer to ObjectTalk than Smalltalk.
Instead of subclassing, there's cloning. Instead of interfaces, there are protocols. Inheritance becomes a question of heredity, not of bestowal of rights. It's not about ruling, but about synergy, symbiosis, cooperation. It's not about hierarchy, but about being flexible. In that spirit, cloning may be done by concatenation (mixin) or delegation (prototype).
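To make the two kinds of cloning concrete, here is a rough sketch in TypeScript (the compile target is JavaScript anyway). The `Greeter` shape and the helpers `cloneByConcatenation` and `cloneByDelegation` are made up for illustration and are not the language's own syntax.

```typescript
// Hypothetical cell shape, for illustration only.
interface Greeter {
  name: string;
  greet(): string;
}

const base: Greeter = {
  name: "base",
  greet() {
    return `hello from ${this.name}`;
  },
};

// Cloning by concatenation (mixin): copy the members into a fresh object.
// Later changes to the source do not affect the copy.
function cloneByConcatenation<T extends object, U extends object>(source: T, overrides: U): T & U {
  return Object.assign({}, source, overrides);
}

// Cloning by delegation (prototype): the clone holds only its overrides and
// forwards every other lookup to the source at call time.
function cloneByDelegation<T extends object, U extends object>(source: T, overrides: U): T & U {
  return Object.assign(Object.create(source), overrides);
}

const copied = cloneByConcatenation(base, { name: "copied" });
const delegated = cloneByDelegation(base, { name: "delegated" });

// Redefine the behavior on the original after cloning.
base.greet = function (this: Greeter) {
  return `hi from ${this.name}`;
};

console.log(copied.greet());    // hello from copied
console.log(delegated.greet()); // hi from delegated
```

The concatenated clone keeps the behavior it was copied with, while the delegating clone picks up the change made to its parent. Which of the two is the better default is an open question.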
Hierarchy – "rule of a high priest"
Anarchy – "without ruler"
Holarchy – "a whole that is part of a larger whole"
Holarchy may be a more fruitful way to view cells. The multi-agent system is another related concept. Without hierarchical subclassing, cells become autonomous interconnected entities (actors).
Extremes are not beneficial. While Clojure combines pure functional programming with managed stateful reference types, this language combines pure object-oriented programming with immutable value types and first-class functions. Its object-oriented parts remain pure.
Certain aspects of a program might be best modelled using object-oriented thinking, while other aspects are best handled using functional programming principles. Structuring a project as discrete objects, interacting by sending messages containing immutable values, and internally processing those values using pure functions, might be the best of both worlds?
The fields of cells are read-only by default. Mutating fields directly is discouraged. A field points to either a cell (by reference) or an immutable value type. (Value types are autoboxed to their corresponding value cell when used as a recipient of a message.)
With all this immutability, how does one mutate state? Mutability can be implemented using something like Clojure's atoms – a reference type (cell) wrapping an immutable value type, with behaviors for replacing the current value with a new one. This does add one level of indirection, having some performance and memory penalty, but it enables the management of state in a controlled way. Reads should still be fast. As a bonus, it enables reactivity, validation and metadata.
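A minimal sketch of such an atom-like value cell, in TypeScript. The class name `ValueCell` and its methods `read`, `swap` and `watch` are assumptions for illustration, loosely modelled on Clojure's atoms, not a settled design.

```typescript
type Validator<T> = (candidate: T) => boolean;
type Watcher<T> = (previous: T, next: T) => void;

// Hypothetical value cell: a reference wrapping one immutable value.
class ValueCell<T> {
  private current: T;
  private readonly isValid: Validator<T>;
  private readonly watchers: Watcher<T>[] = [];

  constructor(initial: T, isValid: Validator<T> = () => true) {
    if (!isValid(initial)) throw new Error("invalid initial value");
    this.current = initial;
    this.isValid = isValid;
  }

  // Reads are a plain field access, so they stay cheap.
  read(): T {
    return this.current;
  }

  // Replace the value with the result of a pure function of the old value.
  swap(update: (value: T) => T): T {
    const next = update(this.current);
    if (!this.isValid(next)) throw new Error("rejected by validator");
    const previous = this.current;
    this.current = next;
    for (const notify of this.watchers) notify(previous, next);
    return next;
  }

  // Reactivity hook: watchers are told about every replacement.
  watch(watcher: Watcher<T>): void {
    this.watchers.push(watcher);
  }
}

const counter = new ValueCell<number>(0, (n) => n >= 0);
const seen: string[] = [];
counter.watch((previous, next) => seen.push(`${previous} -> ${next}`));

counter.swap((n) => n + 1);
counter.swap((n) => n + 1);
console.log(counter.read()); // 2
console.log(seen);           // [ '0 -> 1', '1 -> 2' ]
```

The indirection is the single `current` field. Validation and watchers hang off the one place where the value can change.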
The differentiation of objects and values as two fundamental concepts, as opposed to "everything is an object", may be helpful when reasoning about code. An object cell is a "concrete" thing with reference semantics, behaviors and internal state. Its fields can hold "abstract" data with value semantics (numbers, booleans, strings, dates, collections, etc). Although these values are in fact immutable, they may be replaced over time if wrapped in a value cell. "Concrete" objects containing "abstract" values.
The challenge lies in uniting those two worlds in a way that makes sense intuitively while enabling the expression of advanced programs, with state managed safely over time.
Because time is of the essence.
The idea is to combine Rich Hickey's approach to identity and state with Bret Victor's Inventing on Principle, using cells. If mutability is managed with atoms, and all mutation of state is the swapping of immutable values by messaging, the history of changes may be (globally) recorded, rewound, paused, replayed, stepped through and inspected. Transactions may also be possible?
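As a sketch of why this becomes cheap once every change is a swap of immutable values: recording history is just keeping the old values around. The `Timeline` class below is a hypothetical illustration in TypeScript. It records a single cell rather than global state and does not attempt transactions.

```typescript
// Hypothetical recorder: every state change is kept as an immutable snapshot.
class Timeline<T> {
  private readonly snapshots: T[];
  private cursor = 0;

  constructor(initial: T) {
    this.snapshots = [initial];
  }

  current(): T {
    return this.snapshots[this.cursor] as T;
  }

  // Record a change. Committing after a rewind discards the old future.
  commit(update: (value: T) => T): T {
    const next = update(this.current());
    this.snapshots.length = this.cursor + 1;
    this.snapshots.push(next);
    this.cursor += 1;
    return next;
  }

  // Step backwards or forwards through history, clamped to its ends.
  step(offset: number): T {
    const target = this.cursor + offset;
    this.cursor = Math.max(0, Math.min(this.snapshots.length - 1, target));
    return this.current();
  }
}

const timeline = new Timeline<readonly string[]>([]);
timeline.commit((items) => [...items, "write"]);
timeline.commit((items) => [...items, "test"]);
console.log(timeline.step(-1)); // [ 'write' ]
console.log(timeline.step(1));  // [ 'write', 'test' ]
```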
When all mutation of state is done by messaging, reactivity is a short step away. Maybe even as reactive as spreadsheets?
Must adhere to Alan Kay's Spreadsheet Value Rule (the word "cell" replaced with the word "field"):
- A field's value relies solely on the formula the user has typed into the field
- The formula may rely on the value of other fields, but those fields are likewise restricted to user-entered data or formulas
- There are no "side effects" to calculating a formula: the only output is to display the calculated result inside its occupying field
- There is no natural mechanism for permanently modifying the contents of a field unless the user manually modifies the field's contents
- In the context of programming languages, this yields a limited form of first-order functional programming
The fourth rule may have to be reworded for this to be applicable to state in a programming language.
Data flow programming goes back to Larry Tesler's 1968 language Compel. With the enormous success of spreadsheets (#1 programming environment), it's a wonder so few programming languages have caught on to their ideas.
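The first three rules are easy to sketch. In the TypeScript below, the `Sheet` class, its `set`/`get` methods and the field names are invented for illustration. For simplicity it recomputes formulas on every read instead of pushing changes, which a real reactive implementation would probably do.

```typescript
// A field holds either user-entered data or a formula over other fields.
type Formula = (valueOf: (name: string) => number) => number;
type FieldDefinition = number | Formula;

class Sheet {
  private readonly fields = new Map<string, FieldDefinition>();

  // The only way to change a field: the user enters data or a formula.
  set(name: string, definition: FieldDefinition): void {
    this.fields.set(name, definition);
  }

  // Evaluation has no side effects: it only computes the field's value.
  get(name: string, visiting: string[] = []): number {
    const definition = this.fields.get(name);
    if (definition === undefined) throw new Error(`unknown field: ${name}`);
    if (typeof definition === "number") return definition;
    if (visiting.indexOf(name) !== -1) throw new Error(`circular reference: ${name}`);
    return definition((other) => this.get(other, [...visiting, name]));
  }
}

const sheet = new Sheet();
sheet.set("price", 20);
sheet.set("quantity", 3);
sheet.set("total", (valueOf) => valueOf("price") * valueOf("quantity"));

console.log(sheet.get("total")); // 60
sheet.set("quantity", 5);
console.log(sheet.get("total")); // 100
```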
The code and its AST are essentially a tree of cells within cells, reminiscent of Lisp's lists.
This enables some interesting ideas:
- Introspection and reflection
  - Easy traversal for introspection, analysis, refactoring, visualization, diffing, history/persistence, reflection, macros, etc
  - Always-running environment like Smalltalk's image-based VM, but only during development?
- Metadata on the cell level
  - Extension of semantics
  - Tags, documentation, links, etc
- Learnable programming (Bret Victor)
  - The message slot syntax (`()`) was chosen because it facilitates some of Bret Victor's powerful ideas (see below)
  - An IDE may concretize/visualize cells, enabling inspection of their state and direct manipulation while running
  - The tree could facilitate adding time as a factor, with time travel debugging
This could be implemented as a more low-level intermediate representation, like in Bosque, without any particular target language/system in mind. In any case, the focus should be on enabling a better developer experience.
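To illustrate how little machinery traversal needs when the program is a tree of cells, here is a small TypeScript sketch. The `CellNode` shape (a name, tags as metadata, nested children) and the `findPaths` helper are assumptions made up for this example, not a proposed representation.

```typescript
// Hypothetical node shape: a cell with metadata and nested child cells.
interface CellNode {
  readonly name: string;
  readonly tags: readonly string[];
  readonly children: readonly CellNode[];
}

// Depth-first walk that returns the path of every cell matching a predicate.
function findPaths(
  node: CellNode,
  matches: (cell: CellNode) => boolean,
  path: string = node.name,
): string[] {
  const found: string[] = matches(node) ? [path] : [];
  for (const child of node.children) {
    found.push(...findPaths(child, matches, `${path}/${child.name}`));
  }
  return found;
}

const program: CellNode = {
  name: "app",
  tags: [],
  children: [
    {
      name: "cart",
      tags: ["stateful"],
      children: [{ name: "total", tags: ["pure"], children: [] }],
    },
    { name: "logger", tags: ["stateful"], children: [] },
  ],
};

const stateful = findPaths(program, (cell) => cell.tags.indexOf("stateful") !== -1);
console.log(stateful); // [ 'app/cart', 'app/logger' ]
```

The same walk could feed analysis, visualization or diffing. Only the predicate and what is collected would change.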
Reflection should be allowed within the confines of something like a revocable proxy object, implementing the object-capability model of computer security. Capabilities should probably be much more restricted for "live" code than for the developer during development. A cell should at least be able to self-reflect (finally the term makes sense).
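JavaScript already ships the building block for this: `Proxy.revocable`. A minimal sketch in TypeScript of handing out a read-only, revocable view of a cell follows. The helper `grantReadAccess` and the `account` object are hypothetical, and a real capability system would need far more than this.

```typescript
// Every mutating trap refuses, so the view is read-only.
function deny(): never {
  throw new TypeError("read-only capability");
}

// Hand out a revocable, read-only view of a cell instead of the cell itself.
function grantReadAccess<T extends object>(cell: T): { view: T; revoke: () => void } {
  const { proxy, revoke } = Proxy.revocable(cell, {
    set: deny,
    defineProperty: deny,
    deleteProperty: deny,
  });
  return { view: proxy, revoke };
}

const account = { owner: "ada", balance: 100 };
const grant = grantReadAccess(account);

console.log(grant.view.balance); // 100
account.balance = 120;           // the owner keeps full access
console.log(grant.view.balance); // 120

try {
  grant.view.balance = 0;
} catch (error) {
  console.log("write denied");
}

grant.revoke();
try {
  console.log(grant.view.balance);
} catch (error) {
  console.log("access revoked");
}
```

After `revoke()`, every operation on the view throws, so whoever held the capability can no longer observe the cell at all.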
Files should not be the concern of the developer. The IDE should abstract away files and folders, allowing developers to focus on their mental model of the project. This doesn't have to be image-based like Smalltalk. For interoperability and version control, it should still use files and folders under the hood, matching the inherent structure of the project.
The current state of programming is full of distractions, taking away focus from what this art form is really all about: designing, building, thinking, exploring. This is especially true for beginners, who are faced with a number of hurdles that first have to be overcome before even being able to write a line of code. Anyone should be able to jump right into a project and immediately write a line of code and see its result. To install an IDE and open a project is admittedly also a hurdle, but it's a much smaller one than the status quo of programming languages. It could even be built into the web browser, if not simply running as a web app. UNIX skills should not be a requirement for programming.
Documentation should be an integral part of the language. Not only in the shape of comments, but code itself should be self-documenting.
Like Smalltalk, code should (be able to) be always running (at least during development).
- Intelligent code completion
- Message signatures with typed slots (and defaults) can take IntelliSense to the next level
- Edit-in-place while running
- Breakpoints, with support for conditions
- Time travel, with intuitive navigation controls and visualization of changes
- Similar to a browser (previous, next, reload)
- Controls for run/pause, and which level to operate on (expression or breakpoint)
Edit: Don't read the list below. Instead check out Glamorous Toolkit!
- Outline of the project's structure, with filters/search
- Internal tab for navigating the project's own modules and cells
- External tab for navigating and managing external modules (dependencies)
- Allowing different views of the project's structure (by hierarchy, tags, etc)
- Could work as a menu: click a cell/method to insert it at the cursor
- State for watching (or visualizing!) the state of specified cells
- Allowing different views into the project's state
- By default showing the active/selected cell
- Terminal for the direct input of messages and output of their return value (REPL)
- Full introspection and reflection capabilities (the developer is granted "God mode")
- Could be just an input field at the bottom, with the result shown as an overlay just above it
- History can be navigated by up/down keys, showing the result for each message in the overlay
- Log for the output of log and error messages (separate from the terminal), with filters/search
- Network for inspecting network activity, with filters/search
- Stack for the current stack trace when paused
- Profile for performance profiling of code
- Cards for interacting with the project, producing visual test cases and examples?
Speaking of trees, it makes a lot of sense to think about apps/servers as multicellular organisms communicating through soil/air/water. Trees in forests communicate by sending chemical signals (messages) and nutrients (data) over fungal networks, as well as pheromones through the air. Similarly, apps/servers need fungi/air/water to communicate with each other, either over short (inter-process communication) or long (networks) distances. The Internet already works (and looks) a lot like fungi. But I'm sure we could do even better.
One characteristic that plants and apps have in common is that they are immobile, in contrast to animals. Running apps on a mobile device is like taking potted plants for a walk. Plant cells may be a more suitable model for computing than animal cells, at least for now.
Taking inspiration from (my limited understanding of) molecular biology to the extreme:
- Cells are like plant stem cells
- Every cell has a nucleus with DNA (encoded information) with restricted access from outside the nucleus
- They are isolated from their environment, protected by a membrane
- They act on messages in their environment, picked up through receptors
- They emit messages into their environment
- Messages have a signature, but no address, and may be picked up by any matching receptor
- Always adapting to stimuli, the DNA in a cell may mutate over time, but most change is epigenetic
- Cells build ever-evolving organisms of differentiated cells, based on their ever-evolving DNA
- The (encoded/encrypted) DNA holds the "recipe" for an entire organism
- Program state as the emerging phenomenon of consciousness?
Good luck implementing that!
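One small piece is at least easy to sketch: messages that carry a signature but no address, picked up by whichever receptors match. The TypeScript below is only an illustration of that single idea. `Environment`, `attach` and `emit` are invented names, and the rest of the list (DNA, membranes, mutation) is not modelled.

```typescript
// Hypothetical message: a signature plus a payload, with no recipient.
interface Signal {
  readonly signature: string;
  readonly payload: unknown;
}

type Receptor = (payload: unknown) => void;

class Environment {
  private readonly receptors = new Map<string, Receptor[]>();

  // A cell exposes a receptor for one signature.
  attach(signature: string, receptor: Receptor): void {
    const existing = this.receptors.get(signature) || [];
    this.receptors.set(signature, [...existing, receptor]);
  }

  // Emit into the environment; returns how many receptors picked it up.
  emit(signal: Signal): number {
    const matching = this.receptors.get(signal.signature) || [];
    for (const receptor of matching) receptor(signal.payload);
    return matching.length;
  }
}

const soil = new Environment();
const heard: string[] = [];
soil.attach("drought", (payload) => heard.push(`root: ${String(payload)}`));
soil.attach("drought", (payload) => heard.push(`leaf: ${String(payload)}`));
soil.attach("frost", (payload) => heard.push(`bud: ${String(payload)}`));

console.log(soil.emit({ signature: "drought", payload: "level 3" })); // 2
console.log(heard); // [ 'root: level 3', 'leaf: level 3' ]
```

The sender never learns who listened, only (in this sketch) how many receptors matched.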
Not "everything is an object", but JavaScript's "almost everything is an object".
In this perspective, the module is the cell. As in biological cells, there can be various subcellular components, including endosymbiotic "cells within cells" (mitochondria, plastids) and other organelles. These subcellular cells are lighter and simpler than the module, but share the same general features (encapsulation, receptors/behaviors, etc).
That makes it 4 levels to reason about:
- Runtime – Environment
- Module – Eukaryotic cell
- Cell – Endosymbiotic cell
- Value – Proteins and other biomolecules
Don't tell FP developers, but one could view functions as bacteria. Or mitochondria, to be precise. "Mitochondria are typically round to oval in shape and range in size from 0.5 to 10 μm" – sounds about right. "Mitochondria are often referred to as the powerhouses of the cell. They help turn the energy we take from food into energy that the cell can use."
Viewing a module as "The Cell" makes sense. Modules/cells are the building blocks of the app/organism. Specialized, differentiated, loosely coupled. Inside you'll find a tiny contained universe of smaller structures, functionality and data that can be reasoned about in isolation.
Instead of the cell's local state being its nucleus, perhaps a better analogy is the database (persisted state)? Of course, each cell doesn't have its own database, but the biology-computing analogy only goes so far. The size of the human genome is ~700 MB.
Numbers should support units (such as `cm`, `ms`, `kg`, `percent`) and easy conversion between compatible units (for example `10 inches as cm`).
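A rough TypeScript sketch of what unit-aware numbers could compile down to. The `UNITS` table, the `Quantity` shape and the `convert` function are assumptions for illustration. Only a handful of units are included and dimensionless ones like percent are left out.

```typescript
// Hypothetical unit table: each unit has a dimension and a factor that
// converts it to that dimension's base unit (m, s, kg).
const UNITS = {
  cm: { dimension: "length", toBase: 0.01 },
  m: { dimension: "length", toBase: 1 },
  inches: { dimension: "length", toBase: 0.0254 },
  ms: { dimension: "time", toBase: 0.001 },
  s: { dimension: "time", toBase: 1 },
  g: { dimension: "mass", toBase: 0.001 },
  kg: { dimension: "mass", toBase: 1 },
} as const;

type Unit = keyof typeof UNITS;

interface Quantity {
  readonly amount: number;
  readonly unit: Unit;
}

// Convert between compatible units; incompatible dimensions are an error.
function convert(quantity: Quantity, target: Unit): Quantity {
  const from = UNITS[quantity.unit];
  const to = UNITS[target];
  if (from.dimension !== to.dimension) {
    throw new Error(`cannot convert ${quantity.unit} (${from.dimension}) to ${target} (${to.dimension})`);
  }
  return { amount: (quantity.amount * from.toBase) / to.toBase, unit: target };
}

// Roughly what `10 inches as cm` might desugar to.
const converted = convert({ amount: 10, unit: "inches" }, "cm");
console.log(converted.amount, converted.unit); // about 25.4 cm, up to floating-point rounding
```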
Smalltalk is known for its exact number representation. This will have to be handled by libraries, as JavaScript's `Number` has "arbitrary imprecision".
If a SmallInteger operation leaves its range, the result becomes a LargeInteger.
The JavaScript number magic equivalent would be `Number` -> `BigInt` if the number is an integer.
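A sketch of that promotion for addition, in TypeScript. The `Integer` type and `addIntegers` function are hypothetical. The sketch assumes both operands really are integers and does not demote a `BigInt` back to `Number` when it would fit again.

```typescript
type Integer = number | bigint;

// Add two integers, promoting to BigInt once the result leaves the range
// in which Number can represent every integer exactly.
function addIntegers(a: Integer, b: Integer): Integer {
  if (typeof a === "number" && typeof b === "number") {
    const sum = a + b;
    if (Number.isSafeInteger(sum)) return sum;
  }
  // BigInt() throws a RangeError for non-integer numbers.
  return BigInt(a) + BigInt(b);
}

const small = addIntegers(1, 2);
const large = addIntegers(Number.MAX_SAFE_INTEGER, 1);

console.log(small); // 3
console.log(large); // 9007199254740992n
```

`Number.isSafeInteger` plays the role of the SmallInteger range check here. Other operations (multiplication, subtraction) would need the same treatment.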