[Git][ghc/ghc][wip/jsbits-userguide] JS/userguide: wip explanation of writing jsbits
Josh Meredith (@JoshMeredith)
gitlab at gitlab.haskell.org
Wed Sep 27 13:14:18 UTC 2023
Josh Meredith pushed to branch wip/jsbits-userguide at Glasgow Haskell Compiler / GHC
Commits:
526d45d0 by Josh Meredith at 2023-09-27T23:14:07+10:00
JS/userguide: wip explanation of writing jsbits
- - - - -
1 changed file:
- docs/users_guide/javascript.rst
Changes:
=====================================
docs/users_guide/javascript.rst
=====================================
@@ -173,3 +173,205 @@ We have to make sure not to use ``releaseCallback`` on any functions that
are to be available in HTML, because we want these functions to be in
memory indefinitely.
+Writing ``jsbits`` for Libraries with C FFI Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many libraries use C FFI functions for low-level or performance-sensitive
+operations. These are known as ``cbits`` and are often kept in a folder
+of the same name. For such a library to support the JavaScript backend,
+the ``cbits`` must have replacement implementations. By analogy with
+``cbits``, the JavaScript FFI files are known as ``jsbits``.
+
+In principle, it is possible for the JavaScript backend to automatically
+compile ``cbits`` using Emscripten, but this requires wrappers to convert
+data between the JS backend's RTS data format and the format expected by
+Emscripten-compiled functions. Since C functions are often used where
+performance is critical, the cost of these data conversions could
+negate the benefit.
+
+Instead, it is more effective for a library to provide an alternate
+implementation for functions using the C FFI - either by providing direct
+one-to-one replacement JavaScript functions, or by using C preprocessor
+directives to replace C FFI imports with some combination of JS FFI imports
+and pure-Haskell implementation.
+
+Direct Implementation of C FFI Imports in JavaScript
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the JavaScript backend generates code for a C FFI import, it calls
+the function named in the import string, prefixed with ``h$``. An import of
+the C function ``open`` therefore resolves to the JavaScript function ``h$open``.
+No verification is done to ensure that these functions are actually implemented
+in the linked JavaScript files, so calling a missing JavaScript function fails
+at runtime.
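The naming rule and the late failure can be sketched in plain JavaScript. This is only an illustration: ``jsNameFor``, ``linked`` and ``callImport`` are hypothetical names, and the lookup table merely stands in for whatever functions the linked ``.js`` files define.

```javascript
// Hypothetical helper: the name the backend will call for a given C import.
function jsNameFor(cName) {
  return "h$" + cName;
}

// Hypothetical table standing in for the functions defined by linked .js files.
var linked = {
  "h$open": function () { return 0; }
};

// Nothing is checked ahead of time, so a missing entry only fails when called.
function callImport(cName) {
  return linked[jsNameFor(cName)]();
}

var missingError = null;
try {
  callImport("close"); // no h$close was linked in this sketch
} catch (e) {
  missingError = e;    // raised at call time, not at link time
}
```

In this sketch the failure surfaces only when ``callImport("close")`` runs, which mirrors the runtime error described above.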
+
+Based on this, implementing a C function in JavaScript is a matter of providing
+a function of the correct shape (based on the C FFI import type signature) in
+any of the linked JavaScript sources. External JavaScript sources are linked
+by either passing them as arguments to GHC, or listing them in the ``js-sources``
+field of the cabal file. In the latter case, the field is usually placed inside a
+conditional that detects the ``javascript`` architecture, such as:
+
+.. code-block:: cabal
+
+ library
+
+ if arch(javascript)
+ js-sources:
+ jsbits/example.js
+
+Note that ``js-sources`` requires Cabal 3.10 to be used with library targets, and
+Cabal 3.12 to be used with executable targets.
+
+The shape required of the JavaScript function will depend on the particular
+C types used:
+
+* primitives, such as ``CInt``, map directly to a single JavaScript argument
+ using JavaScript primitives. In the case of ``CInt``, this is a JavaScript
+ number. Note that for return values, a JavaScript number will usually
+ need to be rounded or cast back to an integral value when mathematical
+ operations have been applied to it
+
+* pointer values, including ``CString``, are passed as an unboxed ``(ptr, offset)``
+ pair. For arguments, being unboxed will mean these are passed as two top-level
+ arguments to the function. For return values, unboxed values are returned using
+ a special C preprocessor macro, ``RETURN_UBX_TUP2(ptr, offset)``
+
+* ``CString`` values, in addition to the above pointer handling, need to be decoded
+ and encoded to convert between character arrays and JavaScript strings.
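A minimal sketch of the argument shapes listed above, under stated assumptions: ``example_mean`` is a hypothetical import (imagine ``Ptr Word8 -> CInt -> IO CInt``), and a plain ``Uint8Array`` stands in for the RTS buffer object, whose real layout is not described here.

```javascript
// Hypothetical jsbits function for an imagined C import "example_mean".
// The pointer arrives as two arguments (buf, off); the CInt arrives as one number.
function h$example_mean(buf, off, len) {
  if (len <= 0) return 0;
  var sum = 0;
  for (var i = 0; i < len; i++) {
    sum += buf[off + i];
  }
  // Division yields a fractional JavaScript number; "| 0" truncates it
  // back to a 32-bit integer, which is what a CInt result needs.
  return (sum / len) | 0;
}

// Stand-in buffer: the pointer is the array, the offset selects bytes 10, 20, 25.
var bytes = new Uint8Array([0, 10, 20, 25]);
var mean = h$example_mean(bytes, 1, 3); // → 18
```

The ``| 0`` cast is one common way to get back to an integral value; ``Math.floor`` or ``Math.round`` may suit other cases better.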
+
+As an example, let's consider the implementation of ``getcwd``:
+
+.. code-block:: haskell
+
+ -- unix:System.Posix.Directory
+
+ foreign import ccall unsafe "getcwd" c_getcwd :: Ptr CChar -> CSize -> IO (Ptr CChar)
+
+.. code-block:: javascript
+
+ // libraries/base/jsbits/base.js
+
+ //#OPTIONS: CPP
+
+ function h$getcwd(buf, off, buf_size) {
+ try {
+ var cwd = h$encodeUtf8(process.cwd());
+ h$copyMutableByteArray(cwd, 0, buf, off, Math.min(cwd.len, buf_size));
+ RETURN_UBX_TUP2(cwd, 0);
+ } catch (e) {
+ h$setErrno(e);
+ return -1;
+ }
+ }
+
+Here, the C function ``getcwd`` maps to the JavaScript function ``h$getcwd``, which
+exists in a ``.js`` file within ``base``'s ``jsbits`` subdirectory. ``h$getcwd``
+expects a ``CString`` (passed as the equivalent ``Ptr CChar``) and a
+``CSize`` argument. This results in three arguments to the JavaScript function - two
+for the string's pointer and offset, and one for the size, which will be passed as a
+JavaScript number.
+
+Next, the JavaScript ``h$getcwd`` function demonstrates several details:
+
+* In the try clause, the ``cwd`` value is first accessed using a NodeJS-provided method.
+ This value is immediately encoded using ``h$encodeUtf8``, which is provided by the
+ JavaScript backend. This function will only return the pointer for the encoded value,
+ and the offset will always be 0
+
+* Next, another JavaScript backend function, ``h$copyMutableByteArray``, is used to
+ copy bytes from the newly encoded value, starting at offset 0, into the provided
+ pointer and offset. Because these are C-style arrays, we must calculate the number
+ of bytes to copy manually. Here, ``Math.min`` caps the count at ``buf_size`` so that
+ the copy doesn't run past the end of the buffer
+
+* Lastly, the newly allocated buffer is returned to fulfill the behaviour expected by the
+ C function. This is done by ``RETURN_UBX_TUP2(x, y)``, which is a C preprocessor
+ macro that expands to place the second value in a special variable before ``return``-ing
+ the first value. Because it expands into a return statement, ``RETURN_UBX_TUP2`` can
+ be used for control flow as expected
+
+* To use C preprocessor macros in linked JavaScript files, the file must open with the
+ ``//#OPTIONS: CPP`` line, as is shown towards the start of this snippet
+
+* If an error occurs, the catch clause will pass it to ``h$setErrno`` and return -1 - which
+ is a behaviour expected by the JavaScript backend.
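The encode, clamp, copy and return steps above can be tried outside the RTS with standard JavaScript APIs only. This is a sketch, not the backend's code: ``getcwdSketch`` and ``ret1`` are hypothetical names, ``TextEncoder`` replaces ``h$encodeUtf8``, ``Uint8Array.prototype.set`` replaces ``h$copyMutableByteArray``, and the macro is expanded by hand following the description above.

```javascript
// Stand-in for the special variable that receives the second tuple component.
// The name is hypothetical.
var ret1 = null;

// buf is a Uint8Array standing in for the RTS buffer object.
function getcwdSketch(cwdString, buf, off, bufSize) {
  var encoded = new TextEncoder().encode(cwdString); // UTF-8 bytes
  var n = Math.min(encoded.length, bufSize);         // never copy more than bufSize
  buf.set(encoded.subarray(0, n), off);              // copy n bytes to buf[off..]
  // Hand expansion of RETURN_UBX_TUP2(encoded, 0): store the offset, return the pointer.
  ret1 = 0;
  return encoded;
}

var target = new Uint8Array(8);
var ptr = getcwdSketch("/home/user", target, 2, 4);
// target now holds the bytes of "/hom" at indices 2..5; the rest stay 0.
```

With a ``bufSize`` of 4, only the first four encoded bytes are copied, which is the truncation that ``Math.min`` provides in the original function.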
+
+Writing JavaScript Functions to be NodeJS and Browser Aware
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In the above example of implementing ``getcwd``, the function we use in the JavaScript
+implementation is from NodeJS, and the behaviour doesn't make sense to implement in a
+browser. Therefore, the actual implementation will include a C preprocessor condition
+to check if we're compiling for the browser, in which case ``h$unsupported(-1)`` will
+be called. There can be multiple non-browser JavaScript runtimes, so we'll also have
+to check at runtime to make sure that NodeJS is in use.
+
+.. code-block:: javascript
+
+ function h$getcwd(buf, off, buf_size) {
+ #ifndef GHCJS_BROWSER
+ if (h$isNode()) {
+ try {
+ var cwd = h$encodeUtf8(process.cwd());
+ h$copyMutableByteArray(cwd, 0, buf, off, Math.min(cwd.len, buf_size));
+ RETURN_UBX_TUP2(cwd, 0);
+ } catch (e) {
+ h$setErrno(e);
+ return -1;
+ }
+ } else
+ #endif
+ h$unsupported(-1);
+ }
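For illustration, a runtime check of this kind might look like the sketch below. It is not the RTS's ``h$isNode``; ``looksLikeNode`` and ``cwdOrFallback`` are hypothetical names, and the test on ``process.versions.node`` is just one common way of detecting NodeJS.

```javascript
// Hypothetical helper: true when a NodeJS-style "process" global is present.
function looksLikeNode() {
  return typeof process !== 'undefined'
      && process.versions != null
      && typeof process.versions.node === 'string';
}

// Hypothetical wrapper: use the NodeJS-only call when available,
// otherwise hand back a caller-supplied fallback value.
function cwdOrFallback(fallback) {
  return looksLikeNode() ? process.cwd() : fallback;
}
```

Checking ``typeof process`` first avoids a ``ReferenceError`` in runtimes that do not define ``process`` at all.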
+
+Using the C Preprocessor to Replace C FFI Imports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Rather than providing a direct JavaScript implementation for each C FFI import, we
+can use the C preprocessor to conditionally remove these C imports (and possibly
+their use sites as well). Then, some combination of JavaScript FFI imports and Haskell
+implementation can be added instead. As in the direct implementation section, any
+linked JavaScript files should usually be in an ``if arch(javascript)`` condition in
+the cabal file.
+
+As an example of a mixed Haskell and JavaScript implementation replacing a C
+implementation, consider ``base:GHC.Clock``:
+
+.. code-block:: haskell
+
+ #if defined(javascript_HOST_ARCH)
+ getMonotonicTimeNSec :: IO Word64
+ getMonotonicTimeNSec = do
+ w <- getMonotonicTimeMSec
+ return (floor w * 1000000)
+
+ foreign import javascript unsafe "performance.now"
+ getMonotonicTimeMSec :: IO Double
+
+ #else
+ foreign import ccall unsafe "getMonotonicNSec"
+ getMonotonicTimeNSec :: IO Word64
+ #endif
+
+Here, the ``getMonotonicTimeNSec`` C FFI import is replaced by the JavaScript FFI
+import ``getMonotonicTimeMSec``, which imports the standard JavaScript function
+``performance.now``. However, because this JavaScript implementation
+returns the time as a ``Double`` of floating point milliseconds, it must be wrapped
+by a Haskell function to extract the integral value that's expected.
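The arithmetic of the wrapper can be checked with a sample value. In Haskell, ``floor w * 1000000`` parses as ``(floor w) * 1000000``, so the reading is truncated to whole milliseconds before scaling. The sketch below mirrors that in JavaScript; ``msToNsFloored`` is a hypothetical name and the input is a made-up reading, not a measured one.

```javascript
// Mirrors (floor w) * 1000000: truncate to whole milliseconds, then scale to ns.
function msToNsFloored(ms) {
  return Math.floor(ms) * 1000000;
}

var sampleNs = msToNsFloored(1234.567); // → 1234000000
```

The fractional 0.567 ms is discarded, so the result here has millisecond granularity even though it is expressed in nanoseconds.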
+
+In this case, a mixed Haskell and JavaScript replacement was chosen because
+reading the clock is a system call, which cannot be written in pure Haskell. C
+functions are often used for similar system-level functionality. In such
+cases, it's recommended to import the required system functions from standard
+JavaScript libraries (or NodeJS/browser, as was required for ``getcwd``), and
+use Haskell wrapper functions to convert the imported functions to the appropriate
+format.
+
+In other cases, C functions are used for performance. For these cases, pure-Haskell
+implementations are the preferred first step for compatibility with the JavaScript
+backend, since they are more future-proof against changes to the RTS data format.
+Depending on the use case, compiler-optimised JS code might be hard to compete with
+using hand-written JavaScript. Generally, the most likely performance gains from
+hand-written JavaScript come from functions with data that stays as JavaScript
+primitive types for a long time, especially strings.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/526d45d096e131bdb02187ba52a4342441d55251