Spec

Badlang is an experiment. It is different. Its the product of wondering what a minimal language built around symmetric meta-programming would be like. The design decisions are made to match my curiosity and personal tastes, as well as my implementation skill set. If you don't like it, make your own bad language: this is my badlang.

Badlang hopefully should answer the question of what can you do with flexible symmetric meta programming combined with compile time execution. I think there is a lot of potential for neat stuff.

I want a language with more power to customize (and thus confuse) than C++, I want a language with more flexible type system and look-ups than python, but to fail faster (via static typing). I want a language that can run fast. I want a language that can be its own build script, and can extend its own compiler, all as libraries. Perhaps you think I'm crazy for this (As I do), but it should be fun to try to make it; badlang is my attempt.

Badlang is in progress and under design: it is not at all usable, or finalized. It won't be at all usable anytime soon. If you want a language to use, go try something like Go, Haskell, Python, Rust or even C++. Or maybe make your own.

Basics

Badlang is a compiled, strongly typed, statically typed procedural programming language. Well, there may be some debate on it "statically typed" is accurate, but at the very least, it is explicitly typed.

Feature wise, Badlang is focused on meta programming. All features, when practical, are left to be implemented in libraries via meta programming rather than in the language itself.

To enable this, bandlang supports runtime Type generation: Code can create new Types. Libraries can use this to implement things like generics.

To use this effectively, Types are variables. You may have a a variable A of type Type, then declare a variable of Type A.

Badlang also relies heavily on dynamic look-ups for attributes. Callables also have a special method that can resolve them based on the types passed which allows implementing overloading and generic functions. This, along with Type variables also allows easy runtime reflection.

To run this in a performant manner, compile time function execution will allow propagating constants, which will include the Types in most cases. This means the dynamic look-up functions will usually be called at compile time, saving the runtime overhead.

Types

A type describes a kind of object. These "types" are of type Type. The type Type is special, and predefined (It fills the role of type in python).

Type is the type of all types. Thus Type is the only meta-Type allowed. Meta-type like behavior can be gotten by making functions that return customized instances of Type.

A instance of Type is internally represented as a pointer to a TypeInfo. A special TypeInfo value is provided by the compiler, which Type itself is a pointer to.

The underlying TypeInfo for a type is hidden from the application, and is immutable. A new type can is created by providing a TypeInfo, which is copied, and a pointer to the new immutable copy is returned, as a new Type (Instance of Type). This functionality will like be provided as a method on TypeInfo.

There are no "Static" methods or fields (In the Java sense), but its easy to allocate some data, and associate it with a type when you create the type to get a similar effect.

A TypeInfo instance internally is a struct (Badlang's structs are defined in terms of Type, so we can'y use them here) with several fields, all of which are function pointers. The set of functions includes is exactly determined by the set of things the compiler/runtime may possibly need to know about the type. This exact set, and the signatures of the functions is a work in progress, but here is a rough guess, as a C++ struct.

These values are exposed via methods of Type Type, though in a read only manner (returned by value, so a copy is returned)

Example

Suppose you have a Type MyType, and and instance of that type myInstance.

myInstance is an instance of MyType

MyType is an instance of Type

Type is an instance of Type

Consider the code:

// Make an instance of MyType called myInstance
var myInstance MyType = whatever
// access a field using the "." operator
x:=myInstance.SomeMethod()

The bottom line compiles to (approximately):

// Resolve SomeMethod
v vType:=MyType.Dot(MyType,myInstance,"SomeMethod")
// Resolve what function to call based on Type and argument Types
theFunc:=vType.Call(vType,0,NULL)
// Check that theFunc is a function that accepts the correct types of args
assert(theFunc.argTypes==[MyType])
// Call the function
x:=theFunc(v)

Note that in this case, assuming MyType is known at compile time, everything but the very last line can also be computed at compile time. Thus the entire process is reduced to a single function call. This is then exposed to the compiler's optimizer, and could even be inlined.

Special Types

  • Type: Can be used as the type of a variable, of function parameter. Implemented as a pointer to an immutable TypeInfo.
  • TypeInfo: struct containing the information needed about a type (see above)
  • Functions: Not actually a single type. There is one type for each function signature. There are special literals for creating functions ({args|statements}), and function types ({args})
  • Void: Not actually special, just happens to do nothing, and has size 0. This is the type of return values when nothing is returned.
  • Pointers: Again, not just one type, but many. Any 2 pointers to the same type will not have the unique field set, but are otherwise identical, and thus be considered equivalent types. Its possible to implement additional pointer like types, but the built in ones are the ones produced by the & operator for the built in types.

Subtypes

Badlang does not have any subtyping. There is no type based polymorphism. No Covariance and no contravariance. It is like Go in this respect.

However, it is possible to create interfaces (Go style), as well as inherit attributes (like embedding in Go). These features are not part of the language: they can be implemented using it. Implementations of these techniques will likely be included in the standard library.

Interestingly enough, interfaces pretty much identical (in implementation, and use) to Go could be implemented (you can write code to dynamically check for and look-up the needed methods, and build the itable on first use like Go does, then pack the value or a pointer to it in a struct along with the pointer to the itable).

Embedding would simply be implemented by forwarding attribute look-up attempts to the embedded type when they aren't found on the parent. This can be done trivially in the Type's Dot function.

Equivalence

Two values are equivalent iff it substituting one for the other has no specified effect in all possible usages. This is symmetric. Note that since equivalence is observable BadLang, this definition is self referential.

This means that outside of undefined behavior, its impossible to distinguish 2 equivalent values. This means (among other things) that it de-duplicating them (using just one and deleting the other) would be a legal optimization.

This generally applies to things like SSA values which are non-addressable and immutable. It is important to note the implication of mutability: if mutating one does not mutate the other, they can not be equivalent.

Loop-invariant code motion is an example of an optimization that de-duplicates such values from each run through a loop.

In BadLang, equivalence is defined very carefully for instances of Type. An instance of Type A can be used as an instance of Type B iff Type A is equivalent to Type B.

If you define a type for error code enumeration values in one place, and another such type in a different place, they can be equivalent, or not, depending on how they are made.

The same is true for generic collections: you can make a function that is passed in a type, and returns a list of that type. If two equivalent types are provided, its possible for it to return 2 list types that are equivalent or not depending on the function. In the case where they are equivalent, it would be legal (but not required) for the compiler to de-duplicate them.

The implications of this are quite significant: to compare types for equivalence requires defining equivalence on types. As part of this, BadLang has to compare the fields of TypeInfo, which contain functions which can be closures over arbitrary state. This means equivalence must be defined for all types, not just Type.

Functions

Any Type can be made callable with any combination of argument types. These are referred to as Callable Types, and their instances are Callables. These are NOT functions (When provided with a set of arg types, they return a function though)

A function in Badlang is an instance of the Function Type, and its instance data dictates its exact call signature (argument types, and return type).

The Function type also happens to be callable (Its Call function returns itself), but thats just a convenience thing, since what really matters is that the compiler can generate an actual call for instance of the Function type.

Type Arguments

Since Types are simply instances of Type, they can be passed as arguments. There is one special thing though: arguments to a function can have a type that is a previous argument.

Here is a function, in Go syntax, that does this:

func someFunction(A Type, B A) A

A is some Type, and B is an instance of that Type. This function also returns a value of Type A.

Here is an approximation of the same thing in Java using generics:

class Blah<T> {
    T someFunction(T B){
    // ...
    }
}

This is particularly interesting, since it shows (part of how) Badlang can do Java style generics, without and extra features, or even libraries. It just works.

A more useful example:

func new(A Type) (Aptr Type, Aptr)

This has 2 return values: A Type (Aptr) and a value of type Aptr.

It might be cleaner (and faster) to pass in the pointer type, and return a value of that type instead:

func new(Aptr  Type) Aptr

Though this loses the ability to create special pointer types, such as one only valid for the current thread, from a special memory pool etc.

An even stranger construct, which has the benefits of both, is

func new(A Type) (Aptr Type,func()(Aptr))
 
_, newInt := new(int)
var i *int = newInt()

With this version, if the Type is know at compile time (here its int), the newInt function can be produced at compile time (assuming new is pure). Theoretically, it can be reduced to a malloc of a constant size.

This idea of splitting up functions via making functions that return functions is key: it can allow part of a function to be evaluated at compile time. While it is possible for an optimizer to accomplish this though some fancy tricks, its better to be explicit about it, since realistically the optimizer won't be great at it.

Syntax

The syntax is very much a work in progress, and likely to change drastically. This is simply a home for the current plans.

# A comment
 
# A declaration of type T.
var T t=blah
 
# A declaration, inferring the type from the expression on the right
t2:=t
t==t2 # True
 
# Functions:
f:={|} # Do nothing, no args, no return value
f:={Int a > Int | a} # Takes an Int, returns and Int
add:={Int a , Int b> Int | a+b} # Takes 2 Ints, returns their sum.
add:={Int a, b> Int | a+b} # Same as above, but less verbose (Go style)
 
# A complex, but not very useful function:
foo:={Type A, B, A a, B b > Type C, C |
    return a+b
}
# And some uses of foo:
X, x:=foo(Int, Int, 2, 3)
X, X x:=foo(Int, Int, 2, 3)
_, Int x:=foo(Int, Int, 2, 3)
var _,Float y=foo(Float, Float, 2.0, 3.0)
# It would be pretty easy to make a callable wrappers for
# functions like foo so they can be be called without the explicit type parameters.
 
# Some code
sum:={Int i > Int|
    if(i>0,{> Int|
        i+sum(i-1)
    },{> Int|
        0
    })
}
abs:={Int i > Int|
    if(i>0,{> Int|
        i
    },{> Int|
        -i
    })
}

There are a few choice for how to do returns. A current idea to do what Rust does. The last expression is returned implicitly (and some other details). Explicit returns are complicated (or less useful) by using closures and functions for flow control.

# Here is sum again, but with an optional return type specified
sum:={i Int > Int |
    if(i>0,{|i+sum(i-1)},{|0})
}

Note that this closure/function (sum) can refer to itself. For better or worse, the scope sum is declared in is filled in before any closures in it bind their variables. This can make shadowing a bit odd at times, but is intended to remove the need for forward declaration.

# Here is a literal for a function type.
# F is the Type of functions that take an Int and return an Int
F:={i Int > Int}
# Here is an instance of such a function
var f F={i Int > Int | i}
# This function infers its return type
a:={i Int | i}
# And this one is a wrapper around it that specifies the return type.
# This amounts to a runtime type assertion,
# but in this case, only constants are needed to compute the return type of a
# and thus the constant propagation in the compiler will generate then
# execute this type assertion code at compile time
b:={i Int > Int | a(i)}

Functions that have a return type in their signature are not needed for the language, but its an interesting idea, and is being considered for inclusion.

Callable that act as wrappers to assert return types are possible. Callables that take in paramaters of any types, and expand them to explicit type paramaters are also a possible wrapper.

A short hand for Type paramaters like:

var x A;
# Full
f:={A Type, b A| b }
f(A,x)
# Shorthand
f2:={b !A| b}
f2(x)

could be useful. Its just syntactic sugar. I dislike adding features, but this could be added if it would save a lot of typing and help clarity.

Similar shorthands could be implemented for default paramater values, which can also be implemented via callable wrappers.

The decision to included such extensions will only be made it they are significantly better than libraries can provide in readability, debug-ability and maintainability.

Misc

Maybe immutability by default (Like Rust), since optimizations are way simpler if functions and Types are immutable usually, and the mutable cases simply be unoptimized.

There must be some exception throwing system that is marked as ok to throw at compile time (asserts basically). There should also be a kind that can only be thrown at compile time (any code that can throw them at runtime is rejected by the compiler). This can be used for things that are supposed to be done at compile time (loading other source files, generating library bindings, importing resource files, some type checks etc)

Bootstrapping

When the compiler initially starts up (as is the case when you run it on a build script, at the beginning of the script), there is a minimal set of predeclared identifiers implemented in the compiler. These include the basics needed to use the language and some way get at the standard library (import?).

The build script can then configure an necessary restrictions (such as setting up compile time file system access limitations, import paths, etc), and then invoke compilation of the application itself. This will compile it in memory: the build script is responsible for making the necessary calls to either write it out to disk (and then maybe link it), or run it directly.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License