[Haskell-cafe] data analysis question
Richard A. O'Keefe
ok at cs.otago.ac.nz
Thu Nov 13 06:05:00 UTC 2014
On 13/11/2014, at 3:52 pm, Brandon Allbery <allbery.b at gmail.com> wrote:
>
> It is an open source implementation of S ( http://en.wikipedia.org/wiki/S_(programming_language) ) which was developed specifically for statistical applications. I would wonder how much of *that* was shaped by Fortran statistical packages….
The prehistoric version of S *was* a Fortran statistical package.
While the inventors of S were familiar with GLIM, GENSTAT, SPSS, SAS, BMDP, MINITAB, &c.,
they _were_ at Bell Labs, and so the language looks a lot like C.
Indeed, several aspects of S were shaped by UNIX, in particular the way S (but not R)
treats the current directory as an “outer block”.
Many (even new) R packages are wrappers around Fortran code.
However, that has had almost no influence on the language itself.
In particular:
- arrays are immutable
> (v <- 1:5)
[1] 1 2 3 4 5
> w <- v
> w[3] <- 33
> w
[1] 1 2 33 4 5
> v
[1] 1 2 3 4 5
- functions are first-class values, and higher-order
  functions are commonplace
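For instance, a quick sketch at the R prompt, using the base
higher-order functions sapply and Reduce:
> sapply(1:3, function(i) i^2)   # map an anonymous function
[1] 1 4 9
> Reduce(`+`, 1:5)               # fold a binary operator
[1] 15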
- function arguments are evaluated lazily
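A quick illustration: an argument that is never used is never
evaluated, so the stop() below never fires:
> f <- function(x, y) x
> f(1, stop("boom"))
[1] 1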
- good style does *NOT* “traverse arrays by indexes”
but operates on whole arrays in APL/Fortran 90 style.
For example, you do not do
r <- matrix(0, m, n)   # preallocate, then fill cell by cell
for (i in 1:m) for (j in 1:n) r[i,j] <- f(v[i], w[j])
but
r <- outer(v, w, f)
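Concretely, with f = `+`:
> outer(1:3, 1:2, `+`)
     [,1] [,2]
[1,]    2    3
[2,]    3    4
[3,]    4    5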
If you _do_ “express data transformations and queries
functionally in R” — which I repeat is native good style —
it will perform well; if you “traverse arrays by indexes”
you will wish you hadn’t. This is not something that
Fortran 66 or Fortran 77 would have taught anyone.
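To make the contrast concrete (a sketch, not a benchmark; the
vectorised form replaces a million interpreted loop iterations
with one call into compiled code):
> x <- rnorm(1e6)
> y <- sqrt(abs(x))                                 # whole-array style
> z <- numeric(length(x))                           # indexed style:
> for (i in seq_along(x)) z[i] <- sqrt(abs(x[i]))   # same result, far slower
> identical(y, z)
[1] TRUE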
Let me put it this way: R is about as close to a functional
language as you can get without actually being one.
(The implementors of R consciously adopted implementation
techniques from Scheme.)
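The most visible Scheme inheritance is lexical scoping, which
gives R genuine closures:
> make.counter <- function() {
+   n <- 0
+   function() { n <<- n + 1; n }
+ }
> count <- make.counter()
> count()
[1] 1
> count()
[1] 2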