[Haskell-cafe] How to safely and fast write repa array to a file?
Compl Yue
compl.yue at icloud.com
Sat Apr 4 14:08:28 UTC 2020
My feeling is that, since the result data is dense and already in RAM,
your approach is probably already about as safe and efficient as it
gets. Benchmarks may favor slight variants for differently sized arrays,
but the speed differences should be negligible.
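One such slight variant, as a minimal sketch only: skip the ByteString
wrapper and hand the buffer to `hPutBuf` under `withForeignPtr`, which
keeps the memory alive for the duration of the write. The name
`writeRawPPM`, the 4x2 test image and the file name `out.ppm` are made
up for illustration, and 3 bytes per pixel is assumed, as in your
`w*h*3`.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Utils (fillBytes)
import System.IO (IOMode (WriteMode), hPutBuf, hPutStr, withBinaryFile)

-- | Write a binary PPM (P6) file straight from a pixel buffer.
-- Assumption: the buffer holds w*h*3 bytes of RGB data.
writeRawPPM :: FilePath -> Int -> Int -> ForeignPtr Word8 -> IO ()
writeRawPPM path w h fp =
  -- withBinaryFile closes the handle even if the write throws.
  withBinaryFile path WriteMode $ \hdl -> do
    hPutStr hdl ("P6\n" ++ show w ++ " " ++ show h ++ "\n255\n")
    -- withForeignPtr keeps the buffer alive until hPutBuf returns.
    withForeignPtr fp $ \p -> hPutBuf hdl p (w * h * 3)

-- Hypothetical demo: a 4x2 uniformly dark-grey image.
main :: IO ()
main = do
  let w = 4 :: Int
      h = 2 :: Int
  fp <- mallocForeignPtrBytes (w * h * 3)
  withForeignPtr fp $ \p -> fillBytes p 0x40 (w * h * 3)
  writeRawPPM "out.ppm" w h fp
```

With your `Image` type the buffer would presumably be
`castForeignPtr (RF.toForeignPtr img)`, exactly as in your `body`.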
Architecturally, though, I suggest it can be more efficient to mmap the
data file into a foreign ptr and use that to back the array receiving
the computation result, then call `msync` after the computation to
guarantee the data is written to non-volatile storage. This puts no
burden on the GC in the first place, and of course needs no further
memory pinning etc. at all; it just leverages the OS's virtual memory
system (and the modern file systems tightly coupled with it) for its
designed purpose.
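A rough sketch of that idea, under several assumptions: it uses
`mmapFileForeignPtr` from the mmap package with `ReadWriteEx` to create
or extend the file, and binds `msync` by hand through the FFI. The
helper `withMappedOutput`, the file name `result.bin` and the hard-coded
`MS_SYNC = 4` (the Linux value; other platforms differ) are
illustrative, not anything taken from a library.

```haskell
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1_)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.ForeignPtr (finalizeForeignPtr, withForeignPtr)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (pokeByteOff)
import System.IO.MMap (Mode (ReadWriteEx), mmapFileForeignPtr)

-- msync may block on disk I/O, hence a safe foreign call.
foreign import ccall safe "sys/mman.h msync"
  c_msync :: Ptr a -> CSize -> CInt -> IO CInt

-- Assumption: MS_SYNC as defined on Linux.
msSync :: CInt
msSync = 4

-- | Map the first n bytes of a file read-write (creating or extending it),
-- let `fill` write the results in place, then msync and unmap.
-- Not exception-safe as written; real code would want `bracket`.
withMappedOutput :: FilePath -> Int -> (Ptr Word8 -> IO ()) -> IO ()
withMappedOutput path n fill = do
  (fp, off, size) <- mmapFileForeignPtr path ReadWriteEx (Just (0, n))
  withForeignPtr fp $ \base -> do
    fill (base `plusPtr` off)
    -- msync wants a page-aligned address, so sync from the mapping base.
    throwErrnoIfMinus1_ "msync" $
      c_msync base (fromIntegral (off + size)) msSync
  finalizeForeignPtr fp  -- unmap now instead of waiting for the GC

-- Hypothetical demo: fill 4096 bytes with a constant.
main :: IO ()
main = withMappedOutput "result.bin" 4096 $ \p ->
  mapM_ (\i -> pokeByteOff p i (0x2A :: Word8)) [0 .. 4095]
```

In a repa program the `fill` step would be where the array is computed
into the mapped pointer rather than into GC-managed memory.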
I'm doing a PoC of an array database myself, currently using the
vector-mmap package's routines. But the mmap package it depends on
lacks `msync`, and a test case suggests a resource leak with GHC 8.6.5,
so a stable solution has yet to be worked out.
Btw, when you have more than 10k such array files to mmap, you'll hit
another limit: nofile on Linux. I once implemented a FUSE filesystem
that presents many small files on a remote storage server as large
virtual data files; it is written in Go, and I plan to port it to
GHC in the near future.
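To see where that limit sits before mapping thousands of files, a small
check like the following can help. It is only a sketch: it uses
`getResourceLimit` from the unix package, the helper names
`showLimit` and `fitsUnderSoftLimit` are made up here, and it assumes
each mapped file keeps one descriptor open.

```haskell
import System.Posix.Resource
  ( Resource (ResourceOpenFiles)
  , ResourceLimit (..)
  , ResourceLimits (..)
  , getResourceLimit
  )

showLimit :: ResourceLimit -> String
showLimit (ResourceLimit n)     = show n
showLimit ResourceLimitInfinity = "unlimited"
showLimit ResourceLimitUnknown  = "unknown"

-- | True if `wanted` simultaneously open files fit under the soft limit.
fitsUnderSoftLimit :: Integer -> ResourceLimits -> Bool
fitsUnderSoftLimit wanted lims = case softLimit lims of
  ResourceLimit n       -> wanted < n
  ResourceLimitInfinity -> True
  ResourceLimitUnknown  -> False  -- be conservative when unsure

main :: IO ()
main = do
  lims <- getResourceLimit ResourceOpenFiles
  putStrLn $ "nofile soft=" ++ showLimit (softLimit lims)
          ++ " hard=" ++ showLimit (hardLimit lims)
  putStrLn $ "room for 10000 open files: "
          ++ show (fitsUnderSoftLimit 10000 lims)
```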
On 2020/4/3 3:25 AM, cyberfined via Haskell-Cafe wrote:
> Hello, all. I decided to write a parallel ray tracer in Haskell with
> repa. To save a repa array to a file I currently use a dirty trick:
> casting the repa array ptr to a ByteString with fromForeignPtr and then
> writing it to the file with hPut. It looks something like this:
>
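> import Control.Exception (bracket)
> import Foreign.ForeignPtr (castForeignPtr)
> import System.IO (IOMode (WriteMode), hClose, openFile)
>
> import Data.Array.Repa (Array, DIM2, Z (..), (:.) (..))
> import qualified Data.Array.Repa as R
> import Data.Array.Repa.Repr.ForeignPtr (F)
> import qualified Data.Array.Repa.Repr.ForeignPtr as RF
>
> import qualified Data.ByteString as B
> import qualified Data.ByteString.Char8 as BC
> import qualified Data.ByteString.Internal as BI
>
> -- Pixel is a 3-byte pixel type; its definition is not shown here.
> type Image = Array F DIM2 Pixel
>
> writeImage :: FilePath -> Image -> IO ()
> writeImage path img = bracket (openFile path WriteMode) hClose $
>     \hdl -> B.hPut hdl header >> B.hPut hdl body
>   where
>     Z :. h :. w = R.extent img
>     -- Binary PPM (P6) header: width, height, maximum colour value.
>     header = BC.pack $ "P6\n" ++ show w ++ ' ' : show h ++ "\n255\n"
>     -- View the array's buffer as a ByteString without copying.
>     body = BI.fromForeignPtr (castForeignPtr $ RF.toForeignPtr img)
>                              0 (w*h*3)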
>
> My question is: how do I write a repa array to a file safely and fast?