[Haskell-cafe] Compilers: Why do we need a core language?

Mon Nov 26 07:27:07 CET 2012

On 11/25/12 11:08 PM, Richard O'Keefe wrote:
>
> On 24/11/2012, at 5:26 PM, wren ng thornton wrote:
>
>> On 11/20/12 6:54 AM, citb at lavabit.com wrote:
>>> Hello,
>>>
>>> I know nothing about compilers and interpreters. I checked several
>>> books, but none of them explained why we have to translate a
>>> high-level language into a small (core) language. Is it impossible
>>> (very hard) to directly translate high-level language into machine
>>> code?
>>
>> It is possible to remove stages in the standard compilation pipeline, and doing so can speed up compilation time. For example, Perl doesn't build an abstract syntax tree (for now-outdated performance reasons), and instead compiles the source language directly into bytecode (which is then interpreted by the runtime). This is one of the reasons why Perl is (or was?) so much faster than other interpreted languages like Python etc.
>
> I have found Perl anything from the same speed as AWK (reading and writing
> lots of data with hardly any processing) to 10 times slower than AWK (with
> respect to the 'mawk' implementation of AWK).

Perhaps I was too glib in saying "other interpreted languages"; I 
certainly did not mean to include Sed and Awk (which I tend to consider 
shell scripting more than interpreted programming). Also, I'm only 
referring to the startup cost of parsing and compiling to bytecode[1], 
not any other overhead of the actual languages themselves. There are 
significant differences between the actual content of each language[2], 
but that's a linguistic issue rather than an issue of how to design and 
structure the compiler/interpreter.

[1] For one-liners and short programs, this startup cost tends to 
dominate, which is why Perl compiles directly to bytecode. Perl was 
initially intended as a combination of, and replacement for, Sed, Awk, 
Bash, etc.; it only turned into a full programming language later. That 
said, the overhead for starting the Perl runtime is much higher than for 
Sed and Awk, which is why many systems engineers still use Sed/Awk in 
their scripts.
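
To make "direct to bytecode" concrete, here is a minimal sketch in 
Python. It is not Perl's actual implementation; the instruction names 
(PUSH, ADD, ...) and the helpers compile_expr and run are made up for 
illustration. The parser emits stack-machine instructions as it 
recognizes each construct, so no syntax tree is ever built.

```python
import re

def compile_expr(source):
    """Translate an arithmetic expression straight to stack bytecode.

    Instructions are appended to `code` during parsing; there is no
    intermediate tree. Error handling is omitted for brevity.
    """
    tokens = re.findall(r"\d+|[-+*/()]", source)
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            advance()  # consume the closing ")"
        else:
            code.append(("PUSH", int(tok)))

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expr()
    return code

def run(code):
    """Interpret the bytecode on a simple operand stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instr[0] == "ADD":
            stack.append(left + right)
        elif instr[0] == "SUB":
            stack.append(left - right)
        elif instr[0] == "MUL":
            stack.append(left * right)
        else:  # DIV, integer division in this sketch
            stack.append(left // right)
    return stack.pop()

print(run(compile_expr("(1 + 2) * 3")))
```

The trade-off is the usual one: skipping the tree saves a pass and some 
allocation at startup, but leaves nothing convenient to analyze or 
optimize before code is emitted.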

[2] A notorious but too little known example is the implementation of 
regexes, where crufty old tools/languages like awk and grep vastly 
outperform modern tools like Perl and Python:

     http://swtch.com/~rsc/regexp/regexp1.html

To say nothing of the implementations of strings, hash tables, etc., 
which you mentioned.
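
The regex gap in [2] can be reproduced in a few lines. The sketch below 
is a toy, assuming a pattern language with only literal characters and 
optional characters ("a?"); it is neither awk's nor Perl's real matcher, 
and the names backtrack_match and nfa_match are invented here. One 
matcher backtracks; the other tracks every reachable pattern position at 
once, in the style of the automaton simulation the linked article 
describes. Both count their steps on the article's pathological case, 
a?^n a^n matched against a^n.

```python
def backtrack_match(pattern, text):
    """Backtracking matcher: try to consume an item, else skip it if optional.

    `pattern` is a list of (char, optional) pairs. Returns (matched, steps).
    """
    steps = 0

    def go(i, j):
        nonlocal steps
        steps += 1
        if i == len(pattern):
            return j == len(text)
        ch, optional = pattern[i]
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)

    return go(0, 0), steps

def nfa_match(pattern, text):
    """State-set simulation: advance all live pattern positions together."""

    def closure(states):
        # Add every position reachable by skipping optional items.
        seen = set()
        stack = list(states)
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            if i < len(pattern) and pattern[i][1]:
                stack.append(i + 1)
        return seen

    steps = 0
    current = closure({0})
    for c in text:
        following = set()
        for i in current:
            steps += 1
            if i < len(pattern) and pattern[i][0] == c:
                following.add(i + 1)
        current = closure(following)
    return len(pattern) in current, steps

n = 12
pattern = [("a", True)] * n + [("a", False)] * n  # a?^n a^n
text = "a" * n
print(backtrack_match(pattern, text))
print(nfa_match(pattern, text))
```

Both report a match, but the backtracking step count roughly doubles 
with each increment of n, while the state-set version is bounded by 
the text length times the pattern length.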

-- 
Live well,
~wren