[Haskell-cafe] Re: DDC compiler and effects; better than Haskell?

Artem V. Andreev artem at AA5779.spb.edu
Sun Aug 16 10:47:48 EDT 2009


"John A. De Goes" <john at n-brain.net> writes:

> I forgot about links. In that case, consider: getUniqueFilesInDirRecursive.
>
> Attacking irrelevant details in an argument is often called a "strawman attack". Such attacks are
> pointless because they do not address the real substance of the issue. My example is easily
> modified to avoid the issues you raise.
>
> Consider the fact that many file-based operations _can and are parallelized manually by
> developers_. The challenge for next-generation language and effect system designers is to figure
> out _how_ such operations can be automatically parallelized, given sufficient constraints,
> high-level constructs, and a powerful effect system.
>
> Saying, "I don't know exactly how it will look," is quite a bit different from saying "It can't
> be done." I claim the former.
>
> Regards,
>
> Regards,
I am sorry, but this is not about details; it is about the essence. My point was that there are a lot
of subtle issues when dealing with (e.g.) a file system in a real-world manner.
I have no doubt that it is possible to develop a sound logical system that covers them and
then encode it as part of the type system of a language. That would probably lead to
compile-time detection of a wider class of errors. But the problem is that (IMHO) one cannot
deal with these subtleties in a generic manner. That is, there are things that may be done in
parallel and things that may not -- and which is which depends on the actual task you want to
perform. So basically, instead of manually parallelizing something, you will (IMHO) end up writing
complex type annotations so that a compiler can parallelize it on your behalf.
As somebody who has some experience with formal methods, I doubt that the latter will ever
be simpler than the former.
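
To make this concrete, here is a minimal sketch in plain Haskell of what the manual version looks
like, together with (in a comment) the kind of invented effect annotation I would expect an
automatic version to demand. The annotation syntax is pure fantasy, for illustration only.

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
    import Control.Monad (forM, forM_)
    import System.Directory (getDirectoryContents)
    import System.FilePath ((</>))

    -- Manual parallelisation: the programmer decides that the per-entry
    -- actions are independent and forks them explicitly, waiting on one
    -- MVar per worker.
    processDirManual :: (FilePath -> IO ()) -> FilePath -> IO ()
    processDirManual act dir = do
      entries <- filter (`notElem` [".", ".."]) `fmap` getDirectoryContents dir
      dones   <- forM entries $ \e -> do
                   done <- newEmptyMVar
                   _    <- forkIO (act (dir </> e) >> putMVar done ())
                   return done
      forM_ dones takeMVar          -- wait for every worker to finish

    -- The hypothetical "automatic" version: the compiler may only fork on
    -- its own if the types promise that the directory entries denote
    -- distinct files and that 'act' touches nothing outside its argument.
    -- Invented notation, not real Haskell:
    --
    --   processDirAuto :: (FilePath -> IO {writes arg} ())
    --                  -> FilePath -> IO {distinct children} ()
    --
    -- The annotations carry exactly the proof obligations the manual
    -- version discharges by hand.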


> John A. De Goes
> N-Brain, Inc.
> The Evolution of Collaboration
>
> http://www.n-brain.net    |    877-376-2724 x 101
>
> On Aug 16, 2009, at 12:38 AM, Artem V. Andreev wrote:
>
>> "John A. De Goes" <john at n-brain.net> writes:
>>
>>> On Aug 15, 2009, at 6:36 AM, Jason Dusek wrote:
>>>> 2009/08/14 John A. De Goes <john at n-brain.net>:
>>>>> Hmmm, my point (perhaps I wasn't clear) is that different
>>>>> effects have different commutability properties. In the case
>>>>> of a file system, you can commute two sequential reads from
>>>>> two different files.
>>>>
>>>> I think this is a bad example -- it's not something that's
>>>> safe in general, and it discredits your idea. How would the
>>>> compiler even know that two files are not actually the same
>>>> file?
>>>
>>> I don't think the file system is the best example. However, I do think it's a reasonable one.
>>>
>>> Let's say the type of the function getFilesInDir is annotated in such a way as to tell the effect
>>> system that every file in the returned array is unique. Further, let's say the type of the
>>> function makeNewTempFile is annotated in such a way as to tell the effect system that the
>>> function will succeed in creating a new temp file with a name unique from any other existing file.
>> Sorry, but this example is ridiculous. While file *names* in this case might reasonably be
>> assumed to be unique, the *files* themselves may not be. Any modern filesystem supports file
>> aliasing, and usually several forms thereof. And what does the makeNewTempFile function do? Does
>> it create a new file like POSIX mktemp() and return its name, or does it rather behave like POSIX
>> mkstemp()? The first case is a well-known security hole, and the second case does not, as it
>> seems to me, fit well into the rest of your reasoning.
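
For the record: the mkstemp()-style behaviour is what System.IO.openTempFile gives you in
Haskell. The name is chosen and the file is created in one step, which is what closes the
mktemp()-style race. A small sketch; the writeTemp wrapper is made up for illustration:

    import System.IO (openTempFile, hPutStr, hClose)

    -- mkstemp()-style creation: the file is created and opened together,
    -- so no other process can claim the name in between. That window is
    -- exactly the mktemp()-style security hole mentioned above.
    writeTemp :: FilePath -> String -> IO FilePath
    writeTemp dir contents = do
      (path, h) <- openTempFile dir "scratch.tmp"
      hPutStr h contents
      hClose h
      return path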
>>
>> However, let's consider further file system tree traversal. In some cases you might not care
>> whether some of the directories you descend into are actually the same directory, so your
>> proposed optimization would be `safe'. However, in other cases a sequential traversal would work
>> while a parallelized version would not, unless special additional measures are taken. E.g.
>> consider the case of a build system. It traverses a source tree, finds source files and, if the
>> corresponding object files are non-existent or outdated, does something to regenerate them. Now
>> if you have a directory that's actually a link to another directory, and you do a sequential
>> traversal, everything is fine: you descend into the directory the first time, build everything
>> there, and when you descend into it the second time, there's just nothing to do. If you do a
>> parallel traversal, you may well end up in the situation where two threads simultaneously check
>> for an object file, discover it's outdated, and run two build processes simultaneously, with the
>> most likely effect being a corrupted object file.
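
To spell out the race in the build-system example, here is a hypothetical per-file build step
(the names are invented; only the interleaving matters, and the staleness check is reduced to a
bare existence check for brevity):

    import System.Directory (doesFileExist)

    -- Check whether the object file is present and run the compiler if not.
    rebuildIfMissing :: (FilePath -> FilePath -> IO ()) -> FilePath -> FilePath -> IO ()
    rebuildIfMissing compile src obj = do
      exists <- doesFileExist obj
      if exists
        then return ()        -- sequential traversal: the second visit is a no-op
        else compile src obj  -- parallel traversal of two aliased paths: both
                              -- threads can observe 'False' here and both run
                              -- the compiler on the same object file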
>>
>>
>>> Then if you write a recursive function that loops through all files in a directory, and for each
>>> file, it parses and compiles the file into a new temp file, then a sufficiently sophisticated
>>> compiler should be able to safely transform the recursion into parallel parsing and compilation
>>> -- in a way that's provably correct, assuming the original program was correct.
>>>
>>> The promise of a language with a purely functional part and a powerful effect system for
>>> everything else is very great. And very important in the massively concurrent world we are
>>> entering.
>>>
>>>> Well, yes -- which sounds like, there are no guarantees
>>>> in general. Something that works half the time leaves you with
>>>> two responsibilities -- the old responsibility of the work you
>>>> did when you didn't have it and the new responsibility of
>>>> knowing when it applies and when it doesn't.
>>>
>>> In the other thread, I brought up the example of buffering reads. Library authors make the
>>> decision to buffer for one reason: because if some other program is messing with the data,
>>> you're screwed no matter what.
>>>
>>> And yeah, "they might be screwing with the data in just the way you need it to be screwed with,"
>>> (Sebastian), in which case my advice is use C and hope for the best. :-)
>>>
>>> Regards,
>>>
>>> John A. De Goes
>>> N-Brain, Inc.
>>> The Evolution of Collaboration
>>>
>>> http://www.n-brain.net    |    877-376-2724 x 101
>>>
>>>
>>
>> -- 
>>
>> 					S. Y. A(R). A.

-- 

					S. Y. A(R). A.

