cs_loader: filesystem_error crash when LOADER_SCRIPT_PATH directory does not exist #789

@hulxv

🐛 Bug Report

The cs_loader terminates the entire process with an uncaught
std::experimental::filesystem::filesystem_error when its assembly search
path does not exist at runtime. This makes the loader completely unusable in
any deployment environment that differs from the build environment (e.g.
multi-stage Docker builds, CI/CD pipelines, installed packages).

Expected Behavior

metacall_load_from_memory("cs", ...) should return an error (or at worst
fail to initialize the loader gracefully) when the configured assembly
directory does not exist. The process must not be killed.

Current Behavior

The process aborts immediately with:

terminate called after throwing an instance of
  'std::experimental::filesystem::v1::__cxx11::filesystem_error'
  what(): filesystem error: directory iterator cannot open directory:
          No such file or directory [/metacall/build]
Aborted (Signal sent by tkill() 1 0)

The crash happens inside cs_loader_impl_initialize before any user code
runs. Every other already-loaded language loader is also lost because the
whole process dies.

Possible Solution

  1. Replace the throwing directory_iterator(path) overload with the
    std::error_code overload and return early / log a warning when the
    directory is absent.
  2. Verify that ConfigAssemblyName resolves its search path from the runtime
    environment (LOADER_SCRIPT_PATH / CONFIGURATION_PATH) rather than a
    path that is only valid during the build.
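A minimal sketch of item 1, assuming C++17 `std::filesystem` (the loader itself uses `std::experimental::filesystem`, which offers the same `std::error_code` overloads). The helper name `collect_assemblies` and the `.dll` filter are hypothetical and only illustrate the non-throwing pattern; they are not taken from the cs_loader sources.

```cpp
#include <filesystem>
#include <iostream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not from the cs_loader sources): appends the .dll files
// found directly inside `dir` to `out`. Returns false and logs a warning if the
// directory cannot be opened or iterated, instead of throwing.
bool collect_assemblies(const std::string &dir, std::vector<std::string> &out)
{
	std::error_code ec;
	fs::directory_iterator it(dir, ec); // non-throwing overload

	if (ec)
	{
		std::cerr << "warning: cannot open '" << dir << "': " << ec.message() << '\n';
		return false;
	}

	const fs::directory_iterator end;

	while (it != end)
	{
		if (it->path().extension() == ".dll")
		{
			out.push_back(it->path().string());
		}

		it.increment(ec); // non-throwing counterpart of operator++

		if (ec)
		{
			std::cerr << "warning: error while reading '" << dir << "': " << ec.message() << '\n';
			return false;
		}
	}

	return true;
}
```

With this shape, a missing `/metacall/build` yields `false` and a warning, and the caller can decide whether to continue with an empty assembly list or fail the loader initialization.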

Steps to Reproduce

  1. Clone metacall/core and build it inside /metacall (standard
    metacall-build.sh relwithdebinfo install).
  2. Create a separate runtime environment that does not contain
    /metacall/build (e.g. a Docker FROM debian:trixie-slim stage that only
    copies the installed files from /usr/local).
  3. Set the standard runtime env vars:
    LOADER_LIBRARY_PATH=/usr/local/lib
    LOADER_SCRIPT_PATH=/usr/local/scripts
    CONFIGURATION_PATH=/usr/local/share/metacall/configurations/global.json
    
  4. From Python, call:
    from metacall import metacall_load_from_memory
    metacall_load_from_memory("cs", "static double f(double[] a) { return a[0]; }")
  5. The process aborts with the filesystem_error above instead of returning
    an error code.

A minimal reproducer Dockerfile:

FROM debian:trixie-slim AS build
# ... build metacall normally into /usr/local ...

FROM debian:trixie-slim AS runtime
COPY --from=build /usr/local /usr/local
# /metacall/build does NOT exist in this stage
ENV LOADER_LIBRARY_PATH=/usr/local/lib \
    LOADER_SCRIPT_PATH=/usr/local/scripts \
    CONFIGURATION_PATH=/usr/local/share/metacall/configurations/global.json
RUN python3 -c "from metacall import metacall_load_from_memory; \
                metacall_load_from_memory('cs', 'static double f(double[] a){return a[0];}')"
# → process aborts

Context (Environment)

  • metacall: built from master (shallow clone, May 2025)
  • Host OS: Debian trixie (x86-64)
  • .NET runtime: 8.0 (installed via dotnet-install.sh --channel 8.0 --runtime dotnet)
  • Triggered by: any metacall_load_from_memory("cs", ...) call when the
    build tree is absent from the runtime environment

The crash prevents using the cs_loader in any containerised or packaged
deployment, which is the dominant way metacall is used in production.

Detailed Description

The relevant call chain from the stack trace:

cs_loader_impl_initialize        (cs_loader_impl.c:236)
  └─ simple_netcore::start       (simple_netcore.cpp:38)
       └─ netcore_linux::start   (netcore_linux.cpp:247)
            └─ ConfigAssemblyName (netcore_linux.cpp:87)
                 └─ directory_iterator(path)   ← throws here
                      (netcore_linux.h:100)

ConfigAssemblyName (line 87, netcore_linux.cpp) constructs a
std::experimental::filesystem::directory_iterator using the throwing
overload. When the target directory (/metacall/build in our case) does not
exist, the constructor throws filesystem_error. Nothing in the call stack
catches it, so std::terminate() is invoked and the process aborts.
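Independently of which overload is used, the exception could also be stopped at the boundary between the C++ runtime code and the C loader interface. The following is only a sketch under assumptions: `start_runtime` stands in for the C++ start-up path, `loader_initialize_guarded` is a hypothetical entry point, and the 0 / 1 return convention is chosen for illustration rather than taken from the metacall API. It uses C++17 `std::filesystem` for brevity.

```cpp
#include <exception>
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

// Stand-in for the C++ start-up code. Uses the throwing overload, as in the
// report, so a missing directory raises filesystem_error.
static void start_runtime(const char *dir)
{
	fs::directory_iterator it(dir);
	(void)it;
}

// Hypothetical C-facing entry point: returns 0 on success and 1 on failure.
// No exception is allowed to escape into the C caller. `dir` must not be null.
extern "C" int loader_initialize_guarded(const char *dir) noexcept
{
	try
	{
		start_runtime(dir);
		return 0;
	}
	catch (const std::exception &e)
	{
		std::cerr << "loader initialization failed: " << e.what() << '\n';
	}
	catch (...)
	{
		std::cerr << "loader initialization failed: unknown exception\n";
	}

	return 1;
}
```

A guard like this turns the abort into an ordinary initialization failure, so loaders for other languages that are already running in the process are not affected.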

There are two compounding problems:

| # | Problem | Location |
|---|---------|----------|
| 1 | directory_iterator(path) used instead of directory_iterator(path, ec) | netcore_linux.h:100 |
| 2 | The path being iterated resolves to the build directory (/metacall/build) rather than the runtime LOADER_SCRIPT_PATH | netcore_linux.cpp:87 (path derivation) |

Problem 1 is the direct cause of the crash. Problem 2 is why the wrong path
is used even when LOADER_SCRIPT_PATH is set correctly.
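For Problem 2, one possible shape of the fix is to consult the runtime environment first and only fall back to a compiled-in path when the variable is unset. This is a sketch under assumptions: `resolve_script_path` is a hypothetical helper, and how ConfigAssemblyName currently derives its path has not been verified beyond the stack trace above.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: prefers the runtime LOADER_SCRIPT_PATH and falls back
// to `fallback` (for example a build-time default) when the variable is unset
// or empty. Returns an empty string if neither is available.
std::string resolve_script_path(const char *fallback)
{
	const char *env = std::getenv("LOADER_SCRIPT_PATH");

	if (env != nullptr && env[0] != '\0')
	{
		return env;
	}

	return fallback != nullptr ? fallback : "";
}
```

With the environment from the reproduction steps, `resolve_script_path("/metacall/build")` would return `/usr/local/scripts`, so the build directory is only consulted when no runtime path is configured.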
