badlang: A bad programming language

Safety - 25 Feb 2012 07:13

Type and memory safety are very important. Go handles them well, so we will steal that design, but throw in some C++-like layered complexity for good measure.

Go's model of complete type and memory safety, unless you explicitly use the unsafe package, is very nice. How can this be generalized in badlang style?

The Goal: Opt-in danger

Rather than opt-in safety, we want opt-in danger. This means that type and memory safe code is identifiable, is the default, and can be required simply by disabling the ability to opt in to danger. Danger comes in two flavors: danger provided by the language itself, and danger provided by other modules. Since a goal of badlang is to be simple, the danger that is provided by the language itself can actually be provided via some special core modules, and access to them can be controlled the same way as for all other modules. That means that to opt in to dangers, you simply need to import a module that provides those dangers (again, like the unsafe package in Go).

It is important to note that dangers do not get inherited by using dangers. You can use something that lacks memory safety to implement something that is (or claims to be) memory safe. Otherwise nothing safe could ever be built using the lower-level tools!

Generalize!

Now we have reduced the safety system to the question of which modules you can (and do) import. There are several types of "dangers", so modules can simply register themselves with a list of the danger types their public APIs expose. When importing a module, the compiler is provided with the set of modules that the module (and any sub-modules it causes to get imported) can access. To restrict access to various types of dangers, that set of modules can simply be filtered.
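
To make the filtering idea concrete, here is a minimal sketch in Python (badlang has no syntax yet, so Python stands in). Everything in it is an assumption made for illustration: the Module record, the danger names "memory" and "filesystem", and the filter_modules and import_module helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical module record: a name plus the dangers its public API exposes."""
    name: str
    dangers: frozenset = frozenset()


def filter_modules(available, forbidden):
    """Keep only the modules whose public API exposes none of the forbidden dangers."""
    return {name: mod for name, mod in available.items()
            if not (mod.dangers & forbidden)}


def import_module(name, allowed):
    """An import succeeds only if the module is in the set handed to the importer."""
    if name not in allowed:
        raise ImportError(f"module {name!r} is not available in this context")
    return allowed[name]


available = {
    "unsafe": Module("unsafe", frozenset({"memory"})),
    "fs": Module("fs", frozenset({"filesystem"})),
    # alloc may use unsafe internally, but registers no dangers of its own.
    "alloc": Module("alloc"),
    "strings": Module("strings"),
}

# A sandboxed plugin gets a view with memory and file system dangers removed.
sandbox = filter_modules(available, frozenset({"memory", "filesystem"}))
```

With this setup, import_module("alloc", sandbox) succeeds while import_module("unsafe", sandbox) raises ImportError. How modules that are trusted to use dangers internally get their own, wider set is left open here.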

Some use cases:

  • generally keeping most code free of memory/type errors (and free from having to check for them when debugging!)
  • ability to sandbox code (prevent direct memory access, file system access, syscalls, etc.)
  • allowing, but discouraging, widespread use of some low-level basics which are needed to implement basic language features in the standard lib
  • enforcing clean plugin and other API boundaries inside a project
  • allowing explicit (and easy to find) safety violations where required, such as interfacing with other languages, and some optimizations.
  • enforcing arbitrary project level usage decisions (like preventing some part of the project from using threading, synchronous IO, exceptions, or other random things like that)

Source of Safety

So how exactly can a type and memory safe language be implemented in this system? The basic idea is to stick with the same design as Go, but with allocation done by functions provided by libraries which internally use operations that are not memory and type safe. Basically it is a clone of Go in this respect, but new and make come from a module in the standard lib, not the language itself. Like Go, direct memory access would come through some unsafe types provided by another module. It works for Go, so there's no reason not to steal/copy it.

Compile Time is Runtime - 16 Feb 2012 07:54

One of the key aspects of badlang is the build system. Aside from having a rather bad[lang] bootstrapping problem, it's quite novel.

First, a runtime/interpreter is needed. All it needs is the ability to run badlang code. While it may lack the ability to directly generate executables, we shall still call it the compiler.

The compiler needs to be able to run a single build script file written in badlang. This script can do any number of things, but the general process would be as follows:

  1. import compiler modules (specially made available by the compiler)
  2. use compiler modules to generate a module object from the main source file
  3. pass a pointer to the main function of the main module to a function from the compiler modules that builds an executable from it.

This last step is a bit complex. For it to work, badlang needs to have a moving garbage collector. The idea is that the pointer to the main function is the starting point, and it and everything it references are copied into a block of memory, much as a moving garbage collector would do. This block is then simply dumped to disk as an executable with all the needed headers and such. A little extra care can be taken to properly separate code and data, and to recompile all the code from its LLVM IR for the desired target with high optimization if desired.

The idea is that each module can be "run" to produce a module object. This resulting module can be saved if doing incremental compiles. An import call literally invokes the compiler to go compile (if needed) the module, and return the resulting module object.

This means that the code at the top level in modules is meta-programming. Code to create functions, types, generate modules and more can exist there, and it can even call arbitrary functions. This top level code returns a module object.
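
As a rough illustration of "an import runs the module's top-level code and hands back a module object", here is a small Python sketch. The import_module function, the in-memory sources table, and the module names are all hypothetical; a real badlang compiler would read files and would save module objects for incremental compiles rather than keep a dictionary.

```python
import types

_module_cache = {}  # stands in for module objects saved between incremental compiles


def import_module(name, sources):
    """Hypothetical import: "compile" a module by running its top-level code."""
    if name in _module_cache:
        return _module_cache[name]
    module = types.ModuleType(name)
    # Give the module's top-level code a way to import its own dependencies.
    module.import_module = lambda dep: import_module(dep, sources)
    exec(compile(sources[name], name, "exec"), module.__dict__)
    _module_cache[name] = module
    return module


sources = {
    "shapes": (
        "# Top-level code is ordinary code that builds the module's contents.\n"
        "SIDES = {n: i + 3 for i, n in enumerate(['triangle', 'square'])}\n"
    ),
    "main": (
        "shapes = import_module('shapes')\n"
        "def main():\n"
        "    return shapes.SIDES['square']\n"
    ),
}

main_module = import_module("main", sources)
result = main_module.main()
```

Here importing main triggers the import of shapes, whose top-level code computes the SIDES table before main ever runs. A second import of shapes returns the cached module object instead of running the code again.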

As a side note, it is possible to make a badlang program run rather than compile by launching a file with the compiler. Doing this would basically resemble running a Python program: it would dynamically load and compile all the imported modules, generate top-level functions and classes at import time, and could start execution from some entry point in the top level of the main file.

This architecture has some impressive benefits. The compiler functionality is presented to the language, allowing dynamic compiling and loading of code if desired. It allows badlang itself to be used as a build script, and meta-language. It allows arbitrary compile time actions to be performed with no special tools or configuration.

Some things that this makes possible:

  • One could write a function that could be passed a path to some non-badlang source file, and return a module that provides bindings for it. This could be done with no extra build scripts, tools or configuration. It could even go so far as to invoke the compiler for said language and statically link in the resulting library, though that would require some more features, and config to know what compiler to call and how. But this can all be done by a user-provided module, and needs no direct language or tool extensions.
  • Compiling in data files. If desired, file dependencies, say 3D models for a game, could be loaded at compile time (and properly included in incremental compiles).

In addition to many preprocessing and build script type tasks, the meta-programming and compile time execution allows badlang itself to replace many compiler features as well:

  • Enumerations can be implemented with regular code. There is no need for special constructs like Go's iota.
  • Field and method lookups, and some other type details can be implemented (and customized) in badlang rather than in the compiler. This allows even the concept of subclassing or structs to be a library level, not a language level feature.
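
For the enumeration point, a sketch of what "regular code" could look like, written in Python as a stand-in. The make_enum helper and its value parameter are hypothetical names invented for this example; the point is only that a plain function called at a module's top level can do the job of a built-in construct.

```python
def make_enum(*names, value=lambda index: index):
    """Build an enumeration type with ordinary code; `value` maps position to value."""
    members = {name: value(index) for index, name in enumerate(names)}
    return type("Enum", (), members)


# Plain counting, the common use of iota.
Color = make_enum("RED", "GREEN", "BLUE")

# Bit flags, the equivalent of Go's `1 << iota` idiom.
Mode = make_enum("READ", "WRITE", "EXEC", value=lambda index: 1 << index)
```

In this sketch Color.GREEN is 1 and Mode.EXEC is 4. In badlang the calls to make_enum would run at compile time, since top-level module code does.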

To get some of these details (like customized types) to work with good performance, an additional feature is actually needed: compile time execution of deterministic expressions. This means a rather extended version of constant folding. Things like looking up field locations in structs are normally done at compile time, but in badlang they are instead implemented with a function that is the runtime equivalent of the traditional compile time lookup. This function, however, can still be "run" when compiling (since it only depends on known values and has no side effects), which reduces the call to the expected offset and gives the expected performance.
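
A small Python sketch of the field lookup case. Python will not fold the call by itself, so the sketch pre-evaluates it by hand at import time, which stands in for what the badlang compiler would do automatically. The POINT layout, field_offset, and read_z are hypothetical names, and the layout is assumed to be packed with no padding.

```python
import struct


def field_offset(layout, field_name):
    """Ordinary runtime lookup: byte offset of a field in a packed layout.

    `layout` is a tuple of (field name, struct format code) pairs.
    """
    offset = 0
    for name, fmt in layout:
        if name == field_name:
            return offset
        offset += struct.calcsize("=" + fmt)  # "=" means standard sizes, no padding
    raise AttributeError(field_name)


POINT = (("x", "i"), ("y", "i"), ("z", "d"))

# The lookup depends only on known values and has no side effects, so it can be
# run once ahead of time and replaced by its result.
Z_OFFSET = field_offset(POINT, "z")


def read_z(buffer):
    """Hot path: uses the pre-computed offset instead of repeating the lookup."""
    return struct.unpack_from("=d", buffer, Z_OFFSET)[0]


point_bytes = struct.pack("=iid", 1, 2, 3.5)
z = read_z(point_bytes)
```

The lookup function stays a normal function that could also be called with values only known at runtime; it is just never called on the hot path when its inputs are constants.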

Having a type system that is simple to implement, yet this extensible, without incurring performance overhead, will be a key feature of badlang, if it ever works!

Garbage Collection

One other benefit of having compile time be runtime is that the runtime garbage collector can be used to remove inaccessible code and data. More specifically, a moving tracing garbage collector can start a traversal at the entry point function (main), and copy/move all needed data into a contiguous region of memory, or even directly into the output binary.
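
The traversal itself is the usual reachability walk. Below is a minimal Python sketch, assuming the program is described by a hypothetical table mapping each function or data object to the things it references; a real collector would walk actual pointers and copy bytes rather than names.

```python
def reachable(entry, references):
    """Return everything reachable from `entry`, in the order it would be copied."""
    order, seen, stack = [], {entry}, [entry]
    while stack:
        name = stack.pop()
        order.append(name)  # "copy" this object into the output image
        for ref in references.get(name, ()):
            if ref not in seen:
                seen.add(ref)
                stack.append(ref)
    return order


# Hypothetical program: names on the left reference the names on the right.
program = {
    "main": ["parse_args", "run"],
    "run": ["log"],
    "parse_args": [],
    "log": [],
    "unused_helper": ["log"],
    "compiler.compile": [],
}

image = reachable("main", program)
```

In this example unused_helper and compiler.compile never make it into the image, because nothing reachable from main refers to them.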

If sufficient care is taken when designing how code and types are represented in memory, this could even strip out reflection data for types that are never reflected on. Programs that still need dynamic code generation in some places will end up including some of the compiler modules, whereas projects that are fully statically compiled will omit them.

Why badlang? - 24 Jan 2012 08:57

So you think language X is bad? Join the club! And since it's bad, we need a different language, right? I think all programming languages are bad, and we need yet another bad programming language: enter badlang!

Badlang is in the design phase. It's designed to be statically dynamic through eagerly evaluating buzzwords (here buzzwords is defined to be a large set of buzzwords, or just the noun "buzzwords", you choose!).

Inspiration

Go lacks generics. How can generics be added without forcing compile time or runtime overheads in size or speed? How about a completely unrelated language that is entirely designed around solving this one issue? It's badlang!

What if all the different flavors of generics, with their different tradeoffs, could be implemented in code, with no extensions, in a single language? Such a language would allow worse hacks than C++. Surely it would be a badlang.

How can this be done? Dynamic runtime type creation seems to solve the compile time size/time bloat. Doing this same dynamic creation, but pre-evaluating it at compile time, seems to avoid the runtime slowdown. Lots of other things can be built with these tools too.

This focus is the core of badlang: make everything dynamic, then allow it to run at compile time to [re]move the overhead.

What does this lead to?

Design goals:

  1. simple -> most features implemented in stdlib
  2. self-metalanguage -> meta-programming is cool, but if your meta-language is not the target language, you can't have meta-[meta-[meta-[meta[…]]]]-programming. Also, there are fewer languages around if you reuse badlang as the metalang.
  3. usable performant code -> if you want to waste time optimizing, make it possible to go as far as desired making horrible code to make it run fast, but allow such messes to have tolerable APIs.

In short, that's badlang! Now for some bad examples:

printf(someStringConstant,[some args or not])

how about this:

printf(someStringConstant).([some args or not])

Explanation:
Here printf takes an immutable string. If the string is known at compile time, the printf function gets called then, and it returns a callable that knows how many arguments it expects and of what types. The compiler then sees that a callable is getting called, so it enumerates the types passed to the callable (the types of [some args or not]) and passes them to the callable's function lookup. That lookup returns a matching function, or throws an error if the types don't match what is expected. All of this can happen at compile time if the types and the format string can be determined then.
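
A runtime-only sketch of the same shape in Python, to show the two-stage call. It is an illustration under assumptions, not badlang's design: only %d, %s, %f and %% are understood, the mapping from format codes to types is made up for the example, and the callable returns the formatted string instead of writing it out so the result is easy to inspect. In badlang both stages could run at compile time when the format string and argument types are known.

```python
import re

# Assumed mapping from format codes to expected argument types.
_SPEC_TYPES = {"d": int, "s": str, "f": float}


def printf(fmt):
    """Stage one: parse the format string once and return a type-checking callable."""
    codes = [c for c in re.findall(r"%([%dsf])", fmt) if c != "%"]
    expected = [_SPEC_TYPES[c] for c in codes]

    def call(*args):
        """Stage two: check argument count and types, then format."""
        if len(args) != len(expected):
            raise TypeError(f"expected {len(expected)} arguments, got {len(args)}")
        for position, (arg, typ) in enumerate(zip(args, expected)):
            if not isinstance(arg, typ):
                raise TypeError(
                    f"argument {position} should be {typ.__name__}, "
                    f"got {type(arg).__name__}")
        return fmt % args

    return call


greet = printf("%s is %d years old")
message = greet("Ada", 36)
```

Calling greet("Ada", "36") raises TypeError before any formatting happens, which is the check the explanation above wants the compiler to perform ahead of time.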


Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License