[Haskell-cafe] Speculation, OT: Program a Spreadsheet

Sun Nov 19 12:04:04 UTC 2017

On 19.11.2017 at 08:05, trent shipley wrote:
> * Is a spreadsheet you can program from the spreadsheet a reasonable goal?

I.e. use the same programming language for cell formulae and scripts?
Yes, that's very much reasonable.

> * Has it been done?

Not in any mainstream spreadsheet, which boils down to two: Excel and 
OpenOffice/LibreOffice Calc.
Excel offers VBA.
Calc is actually language-agnostic and uses URLs and XML to tie things 
together. It currently supports BeanShell (Java-without-types, it 
seems), Java, JS, Python, and OO Basic.

> So the plan is to take something like GNUmeric or LibreOffice Calc and 
> graft on a primitive function sheet interpreter.

The main point is that you'd have to replace the formula language.
I do not think that Calc was made for that.

> It would be natural to use C++,

You'd instantly kill adoption with that. Only a minimal part of the Calc 
user base is even remotely capable of coding formulae in C++, and even 
of these, a substantial fraction would be able but unwilling.

> but the astute will note that a
> spreadsheet basically does not rewrite cells (unless you use a circular 
> reference), so I'd also like to use a functional language, maybe Haskell.

You don't want to inline the formulae of other cells anyway, because 
then the calculation will be done twice: Once to fill the referenced 
cell, and once as part of the referencing cell.

So for reference cycles, you'll rely on whatever the spreadsheet is 
already doing to deal with them.
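
As a minimal sketch of "compute each cell once, then share the result"
(not how Calc or Gnumeric actually do it): the tiny Formula type, the
cell names and evalSheet below are all hypothetical. The result map is
defined in terms of itself, so a reference looks up the already-shared
value of the other cell instead of re-evaluating its formula.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical miniature formula language, for illustration only.
data Formula
  = Lit Double            -- a constant
  | Ref String            -- the value of another cell
  | Add Formula Formula
  | Mul Formula Formula

type Sheet = Map String Formula

-- Evaluate a whole sheet. Each entry of 'values' is a lazy thunk that is
-- evaluated at most once; a Ref reuses the thunk of the referenced cell.
evalSheet :: Sheet -> Map String Double
evalSheet sheet = values
  where
    values = Map.map eval sheet
    eval (Lit x)   = x
    eval (Ref c)   = Map.findWithDefault 0 c values  -- assumption: a missing cell counts as 0
    eval (Add a b) = eval a + eval b
    eval (Mul a b) = eval a * eval b

example :: Sheet
example = Map.fromList
  [ ("A1", Lit 2)
  , ("A2", Mul (Ref "A1") (Lit 10))
  , ("A3", Add (Ref "A2") (Ref "A2"))  -- A2 is referenced twice, computed once
  ]

main :: IO ()
main = print (Map.toList (evalSheet example))
-- prints [("A1",2.0),("A2",20.0),("A3",40.0)]
```

A reference cycle makes this sketch diverge, so cycle handling would
still have to come from somewhere else.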

> * Would using a functional language as a basic language of the project 
> save effort and intellectual load?

That depends on whether you're talking about the implementation language 
or the cell/macro language.

For the implementation language, you'll save the most time by using 
whatever you already know, unless the project is going to last longer 
than, say, two years. And if you plan on getting other people to join 
the project, you'll want the language with the largest pool of 
interested and able people. That is essentially guesswork, but I'd 
avoid, say, the VBA or PHP crowd ;-)

For the user-facing language, do whatever is easy to use for a 
non-programmer. Haskell should work fine, but be prepared to collect 
references to tutorials. What does *not* work fine is predicting 
performance, particularly for non-programmers. This probably means 
you need strict evaluation, which means that even if it looks like 
Haskell, it's going to be an entirely different language.

> In the longer term I'd like as much of the spreadsheet programmable as a 
> spreadsheet to be written to run on the JVM.
> As near as I can tell near future Java and typed functional languages, 
> include the following options:
> 
> Eta,
> Frege,
> Kotlin and,
> Scala.

I don't know any of these well enough to make any recommendations.

However, for the user-facing language, you need this:
1) As easy to learn as possible.
2) Scales well to a few thousand lines of code. The learning curve must 
not have bumps along that road, because with every bump, a substantial 
fraction of the user base will be deterred from progressing to more 
complicated tasks.
3) You'll need good support for large-scale programming if you want to 
enable "just calling" into third-party modules (which would be pretty 
appealing to people who use Calc for nontrivial stuff). The JVM excels 
at this, BTW, though if you don't use Java, a substantial fraction of 
Maven modules will be awkward to use.

For (3), you'll need static typing.
For (1) - and arguably (2) and (3) as well - you need something that is 
excellent at type inference.

Type inference does not work well for updatable data structures, so you 
will want something that excels at handling immutable data. Which 
essentially rules out C++ or any other imperative language.
The type system is a real problem. For many real-world situations you 
need dependent types, but these produce complicated errors that 
newbies will usually be unable to deal with, which means a bump in 
the learning curve. I don't know of a good way to deal with that; it's 
just on the list of things that I routinely check if somebody asks me to 
take a look at his great new language :-) (Haskell people excel at 
bending its type system to simulate all kinds of things, and I wouldn't 
be surprised if somebody had tried to achieve most dependent-type 
benefits from Haskell's type system; however, people using such a type 
framework will need to know the Haskell type system and the internals of 
the framework to make sense of any error messages that come out of a 
type bug, so this is a variant of programmer's golf, not something you 
want for newbies and learners.)
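
To make "simulating dependent types" concrete, here is a small sketch 
using the well-known length-indexed vector encoding with GHC's 
DataKinds and GADTs extensions. The names Nat, Vec, vhead, vzipAdd and 
vtoList are chosen for this illustration and do not come from any 
particular library.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only accepts non-empty vectors, so there is no "empty list" case.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Element-wise addition; both arguments must have the same length.
vzipAdd :: Num a => Vec n a -> Vec n a -> Vec n a
vzipAdd VNil         VNil         = VNil
vzipAdd (VCons x xs) (VCons y ys) = VCons (x + y) (vzipAdd xs ys)

-- Forget the length again, e.g. for printing.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs

main :: IO ()
main = do
  let row1 = VCons 1 (VCons 2 VNil) :: Vec ('S ('S 'Z)) Int
      row2 = VCons 10 (VCons 20 VNil)
  print (vhead row1)
  print (vtoList (vzipAdd row1 row2))
  -- vhead VNil                   -- rejected at compile time
  -- vzipAdd row1 (VCons 1 VNil)  -- rejected: the lengths differ
```

Uncommenting either of the last two lines gives a type error phrased 
in terms of 'S and 'Z rather than "the rows have different lengths", 
which is the kind of message a spreadsheet user would have to decode.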

The JVM would be desirable, but typical .jar modules make heavy use of 
mutable data. So you need something that's alien to mutable data types 
but not incompatible with them; that's a relatively fine line that the 
language design would have to walk.
I'd probably use a language that allows mutables but disallows aliases 
to them. Clean does this via the type system; other strategies might 
work. However, given that arbitrary modules from the JVM ecosystem might 
throw aliased updateable references left and right, all guarantees are 
off as soon as a computation relies on data provided from a JVM module, 
so maybe it's still not worth using that. (Java folks have been thinking 
about "value types", i.e. immutables, for a while now, but I don't see 
that coming any year soon.)
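
Clean's uniqueness typing has no direct counterpart in standard 
Haskell. One of the "other strategies", swapped in here purely as an 
illustration, is Haskell's ST monad. It does not forbid aliases, but 
the type of runST keeps every mutable reference confined to one 
computation, so nothing outside can observe the updates. The function 
name sumColumn is made up for this sketch.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)
import Data.Foldable (for_)

-- Sum a column with a mutable accumulator. The STRef is created, updated
-- and read inside runST; its type prevents it from escaping, so callers
-- only ever see a pure function.
sumColumn :: [Double] -> Double
sumColumn xs = runST $ do
  acc <- newSTRef 0
  for_ xs $ \x -> modifySTRef' acc (+ x)
  readSTRef acc

main :: IO ()
main = print (sumColumn [1.5, 2.5, 4])
-- prints 8.0
```

This gives no protection once a mutable object arrives from a JVM 
module, which is the caveat raised above.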

Evaluation strategy is another issue.
Non-strict has some nice properties, in particular you don't need to 
differentiate between an expression and its value. However, nonstrict is 
difficult to control performance-wise, and I still see people struggle 
if they see that their code is unexpectedly slow.
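
A classic small instance of this in Haskell is the lazy left fold. The 
two functions below return the same result, and the names sumLazy and 
sumStrict are made up for this sketch. How they differ in memory use 
depends on the compiler and its optimisation settings, which is 
exactly what makes the cost hard to predict.

```haskell
import Data.List (foldl')

-- Lazy left fold: without optimisation this can build a chain of
-- unevaluated (+) thunks as long as the input before adding anything.
-- GHC's optimiser often removes the problem, but not in every case.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Strict left fold: forces the accumulator at every step, so the
-- running total never piles up as thunks.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0

main :: IO ()
main = do
  print (sumStrict [1 .. 1000000])
  print (sumLazy   [1 .. 1000000])
-- both lines print 500000500000
```

Nothing in the types or in the results tells the two apart; a 
non-programmer would only notice through memory use or run time.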

> Note that a spreadsheet needs to give the satisfaction of immediate 
> results, or failing immediate results, the sensation of actively 
> working, so if the language could be interpreted that would be a huge help.

+1

> * Which combination of typed, compiled, interpreted, FOSS functional 
> language that runs on the JVM, JAVA, Haskell, C++, C, used in that order 
> of preference, makes the most sense for the Java compatible functional 
> language at the top of the preference hierarchy?

C++ and C aren't worth it if you plan to go for the JVM.
Don't know if there's a useful JVM port of Haskell.

You *can* interface with OpenOffice using XML descriptors and such, so 
the JVM isn't your only option. You could even use (or invent) a 
language that compiles to a native binary; with LLVM, that has become a 
realistic option.
However, plan to invest some time into understanding LLVM.

> Note also, that I have only the equivalent of an AA degree from a CIS, 
> not a CS, perspective, so the odds are the whole idea is vaporware, 
> unless I can determine feasibility and desirability,

I suspect that no existing language fits the bill.
Given your background, you'll need somebody with language design 
experience; language design is hard because of so many conflicting goals.

> then sweet talk
> real developers to help out.

That's a good plan :-)