[Git][ghc/ghc][wip/jsbits-userguide] JS/userguide: wip explanation of writing jsbits

Josh Meredith (@JoshMeredith) gitlab at gitlab.haskell.org
Wed Sep 27 11:49:11 UTC 2023



Josh Meredith pushed to branch wip/jsbits-userguide at Glasgow Haskell Compiler / GHC


Commits:
a6b550bd by Josh Meredith at 2023-09-27T21:48:34+10:00
JS/userguide: wip explanation of writing jsbits

- - - - -


1 changed file:

- docs/users_guide/javascript.rst


Changes:

=====================================
docs/users_guide/javascript.rst
=====================================
@@ -173,3 +173,182 @@ We have to make sure not to use ``releaseCallback`` on any functions that
 are to be available in HTML, because we want these functions to be in
 memory indefinitely.
 
+Writing ``jsbits`` for Libraries with C FFI Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many libraries use C FFI functions for low-level or performance-sensitive
+operations. These C sources are known as ``cbits`` and are often kept in
+a folder with this name. For such a library to support the JavaScript
+backend, the ``cbits`` must have replacement implementations. By analogy
+with ``cbits``, JavaScript FFI files are known as ``jsbits``.
+
+In principle, it is possible for the JavaScript backend to automatically
+compile ``cbits`` using Emscripten, but this requires wrappers to convert
+data between the JS backend's RTS data format and the format expected by
+Emscripten-compiled functions. Since C functions are often used where
+performance is critical, the cost of these conversions can negate the
+benefit of using C in the first place.
+
+Instead, it is more effective for a library to provide an alternate
+implementation for functions using the C FFI - either by providing direct
+one-to-one replacement JavaScript functions, or by using C preprocessor
+directives to replace C FFI imports with some combination of JS FFI imports
+and pure-Haskell implementation.
+
+Direct Implementation of C FFI Imports in JavaScript
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the JavaScript backend generates code for a C FFI import, it will call
+the function named in the import string, prefixed with ``h$``. No verification
+is done to ensure that these functions are actually implemented in the linked
+JavaScript files, so calling a missing JavaScript function results in a
+runtime error.
+
+Based on this, implementing a C function in JavaScript is a matter of providing
+a function of the correct shape (based on the C FFI import type signature) in
+any of the linked JavaScript sources. External JavaScript sources are linked
+by either providing them as an argument to GHC, or listing them in the ``js-sources``
+field of the cabal file - in which case it would usually be inside a predicate to
+detect the ``javascript`` architecture, such as:
+
+.. code-block:: cabal
+
+  library
+
+    if arch(javascript)
+      js-sources:
+        jsbits/example.js
+
+The shape required of the JavaScript function will depend on the particular
+C types used:
+
+* primitives, such as ``CInt``, map directly to a single JavaScript argument
+  using JavaScript primitives. In the case of ``CInt``, this will be a JavaScript
+  number. Note that for return values, a JavaScript number usually needs to be
+  rounded or cast back to an integral value when mathematical operations are
+  used to compute it.
+
+* pointer values, including ``CString``, are passed as an unboxed ``(ptr, offset)``
+  pair. For arguments, being unboxed will mean these are passed as two top-level
+  arguments to the function. For return values, unboxed values are returned using
+  a special C preprocessor macro, ``RETURN_UBX_TUP2(ptr, offset)``.
+
+* ``CString`` values, in addition to the above pointer handling, will need to be
+  decoded and encoded to convert between character arrays and JavaScript strings.
+
+As an example, let's consider the implementation of ``getcwd``:
+
+.. code-block:: haskell
+
+  -- unix:System.Posix.Directory
+
+  foreign import ccall unsafe "getcwd" c_getcwd :: Ptr CChar -> CSize -> IO (Ptr CChar)
+
+.. code-block:: javascript
+
+  // libraries/base/jsbits/base.js
+
+  //#OPTIONS: CPP
+
+  function h$getcwd(buf, off, buf_size) {
+    try {
+      var cwd = h$encodeUtf8(process.cwd());
+      h$copyMutableByteArray(cwd, 0, buf, off, Math.min(cwd.len, buf_size));
+      RETURN_UBX_TUP2(cwd, 0);
+    } catch (e) {
+      h$setErrno(e);
+      return -1;
+    }
+  }
+
+Here, ``getcwd`` expects a ``CString`` (passed as the equivalent ``Ptr CChar``) and a
+``CSize`` argument. This results in three arguments to the JavaScript function - two
+for the string's pointer and offset, and one for the size, which will be passed as a
+JavaScript number.
+
+Next, the JavaScript ``h$getcwd`` function demonstrates several details:
+
+* In the try clause, the ``cwd`` value is first accessed using a NodeJS-provided method.
+  This value is immediately encoded using ``h$encodeUtf8``, which is provided by the
+  JavaScript backend. This function will only return the pointer for the encoded value,
+  and the offset will always be 0
+
+* Next, another JavaScript backend function, ``h$copyMutableByteArray``, is used to
+  copy the encoded value, starting at offset 0, into the provided pointer and offset.
+  Because these are C arrays, we must calculate the number of bytes to copy manually,
+  which is done here with the JavaScript ``Math.min`` to ensure that the copying
+  doesn't overflow past the end of the buffer
+
+* Lastly, the newly allocated buffer is returned to fulfill the behaviour expected by the
+  C function. This is done by ``RETURN_UBX_TUP2(x, y)``, which is a C preprocessor
+  macro that expands to place the second value in a special variable before ``return``-ing
+  the first value. Because it expands into a return statement, ``RETURN_UBX_TUP2`` can
+  be used for control flow as expected
+
+* To use C preprocessor macros in linked JavaScript files, the file must open with the
+  ``//#OPTIONS: CPP`` line, as is shown towards the start of this snippet
+
+* If an error occurs, the catch clause will pass it to ``h$setErrno`` and return -1 - which
+  is a behaviour expected by the JavaScript backend.
+
+Writing JavaScript Functions to be NodeJS and Browser Aware
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In the above example of implementing ``getcwd``, the function we use in the JavaScript
+implementation is from NodeJS, and the behaviour doesn't make sense to implement in a
+browser. Therefore, the actual implementation will include a C preprocessor condition
+to check if we're compiling for the browser, in which case ``h$unsupported(-1)`` will
+be called. There can be multiple non-browser JavaScript runtimes, so we'll also have
+to check at runtime to make sure that NodeJS is in use.
+
+.. code-block:: javascript
+
+  function h$getcwd(buf, off, buf_size) {
+  #ifndef GHCJS_BROWSER
+    if (h$isNode()) {
+      try {
+        var cwd = h$encodeUtf8(process.cwd());
+        h$copyMutableByteArray(cwd, 0, buf, off, Math.min(cwd.len, buf_size));
+        RETURN_UBX_TUP2(cwd, 0);
+      } catch (e) {
+        h$setErrno(e);
+        return -1;
+      }
+    } else
+  #endif
+      h$unsupported(-1);
+  }
+
+Using the C Preprocessor to Replace C FFI Imports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Rather than providing a direct JavaScript implementation for each C FFI import, we
+can use the C preprocessor to conditionally remove these C imports (and possibly
+their use sites as well). Then, some combination of JavaScript FFI imports and Haskell
+implementation can be added instead. As in the direct implementation section, any
+linked JavaScript files should usually be in an ``if arch(javascript)`` condition in
+the cabal file.
+
+As an example of a mixed Haskell and JavaScript implementation replacing a C
+implementation, consider ``base:GHC.Clock``:
+
+.. code-block:: haskell
+
+  #if defined(javascript_HOST_ARCH)
+  getMonotonicTimeNSec :: IO Word64
+  getMonotonicTimeNSec = do
+    w <- getMonotonicTimeMSec
+    return (floor w * 1000000)
+
+  foreign import javascript unsafe "performance.now"
+    getMonotonicTimeMSec :: IO Double
+
+  #else
+  foreign import ccall unsafe "getMonotonicNSec"
+    getMonotonicTimeNSec :: IO Word64
+  #endif
+
+Here, the ``getMonotonicTimeNSec`` C FFI import is replaced by the JavaScript FFI
+import ``getMonotonicTimeMSec``. However, because the JavaScript implementation
+returns the time as a ``Double`` of floating point milliseconds, it must be wrapped
+by a Haskell function that converts it to the integral nanosecond value expected.
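
The pointer convention described in the patch can be tried outside GHC with a plain-JavaScript sketch, runnable under NodeJS. Everything in it is an assumption for illustration only: ``mockByteArray``, ``example_skip`` and ``ret1`` are hypothetical names, the ``u8`` and ``len`` fields only mimic what RTS byte arrays are assumed to look like, and the assignment to ``ret1`` is a hand-expansion of what ``RETURN_UBX_TUP2`` is described as doing (placing the second value in a special variable before returning the first).

```javascript
// Stand-in for the special variable that carries the second component of an
// unboxed pair (hypothetical name; not the RTS's own variable).
var ret1 = null;

// Hypothetical stand-in for an RTS byte array; the real object layout is an
// assumption here, reduced to the two fields this sketch needs.
function mockByteArray(bytes) {
  return { u8: Uint8Array.from(bytes), len: bytes.length };
}

// Shape of a JavaScript function for a hypothetical import such as
//   foreign import ccall unsafe "example_skip"
//     c_skip :: Ptr CChar -> CInt -> IO (Ptr CChar)
// The pointer arrives as two arguments (buf, off); the CInt arrives as a number.
function example_skip(buf, off, n) {
  // Hand-expanded equivalent of RETURN_UBX_TUP2(buf, off + n):
  // store the offset in the side variable, then return the buffer.
  ret1 = (off + n) | 0; // "| 0" keeps the offset integral
  return buf;
}

// Caller side: read the returned buffer, then pick up the offset.
var arr = mockByteArray([104, 105, 0]); // "hi\0"
var p = example_skip(arr, 0, 1);
var pOff = ret1;
console.log(p === arr, pOff, p.u8[pOff]); // true 1 105
```

In a real ``jsbits`` file the function would be named with the ``h$`` prefix and would use the macro itself, which requires the ``//#OPTIONS: CPP`` line at the top of the file.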



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/a6b550bd30c9c0843bc5c25aad9a7f17c30df181
