pearsonkyle / Sharp-coreml

huggingface.co
Total runs: 679
24-hour runs: 3
7-day runs: 4
30-day runs: 588
Model's Last Updated: January 09 2026
image-to-3d


Sharp Monocular View Synthesis in Less Than a Second (Core ML Edition)

Project Page arXiv

This software project is a community contribution and is not affiliated with the original research paper:

Sharp Monocular View Synthesis in Less Than a Second by Lars Mescheder, Wei Dong, Shiwei Li, Xuyang Bai, Marcel Santos, Peiyun Hu, Bruno Lecouat, Mingmin Zhen, Amaël Delaunoy, Tian Fang, Yanghai Tsin, Stephan Richter, and Vladlen Koltun.

We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements.

This release includes a fully validated Core ML (.mlpackage) version of SHARP, optimized for CPU, GPU, and Neural Engine inference on macOS and iOS.

Rendered using Splat Viewer

Getting started
📦 Download the Core ML Model Only
pip install huggingface-hub
huggingface-cli download --include sharp.mlpackage/ --local-dir . pearsonkyle/Sharp-coreml
🧰 Clone the Full Repository

This will include the inference and model conversion/validation scripts.

brew install git-xet
git xet install

Clone the model repository:

git clone [email protected]:pearsonkyle/Sharp-coreml
📱 Run Inference on Apple Devices

Use the provided sharp.swift inference script to load the model and generate 3D Gaussian splats (PLY) from any image:

# Compile the Swift runner (requires Xcode command-line tools)
swiftc -O -o run_sharp sharp.swift -framework CoreML -framework CoreImage -framework AppKit

# Run inference on an image and decimate the output by 50%
./run_sharp sharp.mlpackage city.png city.ply -d 0.5

Inference on an Apple M4 Max takes ~1.9 seconds.

CLI Features:

  • Automatic model compilation and caching
  • Decimation to reduce point cloud size while preserving visual fidelity
  • Input is expected as a standard RGB image; conversion to [0,1] and CHW format happens inside the model
  • PLY output compatible with Splat Viewer, MetalSplatter, and Three.js
Usage: run_sharp [OPTIONS] <model> <input_image> <output.ply>

SHARP Model Inference - Generate 3D Gaussian Splats from a single image

Arguments:
    model              Path to the SHARP Core ML model (.mlpackage, .mlmodel, or .mlmodelc)
    input_image        Path to input image (PNG, JPEG, etc.)
    output.ply         Path for output PLY file

Options: 
    -m, --model PATH           Path to Core ML model
    -i, --input PATH           Path to input image
    -o, --output PATH          Path for output PLY file
    -f, --focal-length FLOAT   Focal length in pixels (default: 1536)
    -d, --decimation FLOAT     Decimation ratio 0.0-1.0 or percentage 1-100 (default: 1.0 = keep all)
                               Example: 0.5 or 50 keeps 50% of Gaussians
    -h, --help                 Show this help message
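One plausible way to picture the `-d/--decimation` option is a simple opacity-based selection: keep only the highest-opacity fraction of Gaussians. This is a hedged sketch, not necessarily the exact strategy the `sharp.swift` runner uses; the function name and arguments are illustrative.

```python
import numpy as np

def decimate(records: np.ndarray, opacity: np.ndarray, ratio: float) -> np.ndarray:
    """Keep the top `ratio` fraction of Gaussians, ranked by opacity.

    records: (N, D) per-Gaussian attribute rows
    opacity: (N,) alpha values in [0, 1]
    ratio:   fraction in (0, 1], e.g. 0.5 keeps 50% of Gaussians
    """
    keep = max(1, int(round(len(records) * ratio)))
    idx = np.argsort(opacity)[::-1][:keep]  # indices of most-opaque Gaussians
    return records[idx]

# 10 dummy Gaussians with opacities 0.0 .. 1.0
recs = np.arange(10, dtype=np.float32)[:, None]
alpha = np.linspace(0.0, 1.0, 10)
kept = decimate(recs, alpha, 0.5)
print(kept.shape)  # (5, 1)
```

Ranking by opacity is one common heuristic because nearly transparent Gaussians contribute least to the rendered image; other strategies (e.g. random subsampling) are also possible.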
Model Input and Output
📥 Input

The Core ML model accepts two inputs:

  • image: A 3-channel RGB image in uint8 format with shape (1, 3, H, W).

    • Values are expected in range [0, 255] (no manual normalization required).
    • Recommended resolution: 1536×1536 (matches training size).
    • Aspect ratio is preserved; input will be resized internally if needed.
  • disparity_factor: A scalar tensor of shape (1,) representing the ratio focal_length / image_width.

    • Use 1.0 for standard cameras (e.g., typical smartphone or DSLR).
    • Adjust slightly to control depth scale: higher values = closer objects, lower values = farther scenes.
    • If using the sharp.swift runner, this input is automatically computed from your image dimensions.
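Assuming the image is already loaded as a numpy array, packing the two inputs described above might look like the sketch below. The helper name is illustrative; the 1536 px focal-length default comes from the CLI options, so a 1536-wide image yields disparity_factor = 1.0.

```python
import numpy as np

def make_inputs(rgb_uint8: np.ndarray, focal_length_px: float = 1536.0):
    """Pack an (H, W, 3) uint8 image into the (1, 3, H, W) layout the
    model expects, and compute disparity_factor = focal_length / image_width."""
    assert rgb_uint8.dtype == np.uint8 and rgb_uint8.ndim == 3
    h, w, _ = rgb_uint8.shape
    image = rgb_uint8.transpose(2, 0, 1)[None, ...]  # HWC -> (1, 3, H, W)
    disparity_factor = np.array([focal_length_px / w], dtype=np.float32)
    return image, disparity_factor

# At the recommended 1536x1536 resolution with the default focal length,
# the disparity factor works out to exactly 1.0.
img = np.zeros((1536, 1536, 3), dtype=np.uint8)
image, disp = make_inputs(img)
print(image.shape, float(disp[0]))  # (1, 3, 1536, 1536) 1.0
```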
📤 Output

The model outputs five tensors representing a 3D Gaussian splat representation:

| Output | Shape | Description |
|---|---|---|
| mean_vectors_3d_positions | (1, N, 3) | 3D positions in Normalized Device Coordinates (NDC): x, y, z. |
| singular_values_scales | (1, N, 3) | Scale parameters along each principal axis (width, height, depth). |
| quaternions_rotations | (1, N, 4) | Unit quaternions [w, x, y, z] encoding the orientation of each Gaussian. |
| colors_rgb_linear | (1, N, 3) | Linear RGB color values in range [0, 1] (no gamma correction). |
| opacities_alpha_channel | (1, N) | Opacity (alpha) values per Gaussian, in range [0, 1]. |

The total number of Gaussians N is approximately 1,179,648 for the default model.

🌍 These outputs are fully compatible with Splat Viewer and MetalSplatter.
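As a first step toward PLY export, the five output tensors can be flattened into one per-Gaussian record. The attribute order below is illustrative (a real splat PLY layout defines its own field order); the tiny N stands in for the ~1.18M Gaussians of the default model.

```python
import numpy as np

N = 4  # tiny stand-in for the ~1,179,648 Gaussians of the default model

# Dummy tensors with the shapes from the output table above
means   = np.zeros((1, N, 3), dtype=np.float32)               # mean_vectors_3d_positions
scales  = np.ones((1, N, 3), dtype=np.float32)                # singular_values_scales
quats   = np.tile([1, 0, 0, 0], (1, N, 1)).astype(np.float32) # quaternions [w, x, y, z]
colors  = np.full((1, N, 3), 0.5, dtype=np.float32)           # linear RGB in [0, 1]
opacity = np.ones((1, N), dtype=np.float32)                   # alpha in [0, 1]

# One flat (N, 14) table: xyz | scale (3) | quat (4) | rgb (3) | alpha (1)
records = np.concatenate(
    [means[0], scales[0], quats[0], colors[0], opacity[0, :, None]], axis=1)
print(records.shape)  # (4, 14)
```

From here, a PLY writer only needs to emit a matching header and the rows in binary or ASCII form.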

🔍 Model Validation Results

The Core ML model has been rigorously validated against the original PyTorch implementation. Below are the numerical accuracy metrics across all 5 output tensors:

| Output | Max Diff | Mean Diff | P99 Diff | Angular Diff (°) | Status |
|---|---|---|---|---|---|
| Mean Vectors (3D Positions) | 0.000794 | 0.000049 | 0.000094 | – | ✅ PASS |
| Singular Values (Scales) | 0.000035 | 0.000000 | 0.000002 | – | ✅ PASS |
| Quaternions (Rotations) | 1.425558 | 0.000024 | 0.000067 | 9.2519 / 0.0019 / 0.0396 | ✅ PASS |
| Colors (RGB Linear) | 0.001440 | 0.000005 | 0.000055 | – | ✅ PASS |
| Opacities (Alpha) | 0.004183 | 0.000005 | 0.000114 | – | ✅ PASS |

Validation Notes:

  • All outputs match PyTorch within 0.01% mean error.
  • Quaternion angular errors are below 1° for 99% of Gaussians.
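A large per-component max diff for quaternions can coexist with tiny angular error because q and −q encode the same rotation. A sign-invariant angular metric (assuming unit quaternions; this is a standard formula, not code from the repo's validation script) looks like:

```python
import numpy as np

def quat_angular_error_deg(q_a: np.ndarray, q_b: np.ndarray) -> np.ndarray:
    """Angle in degrees between rotations encoded by unit quaternions.

    Uses |dot| so that q and -q compare as identical rotations; a raw
    component-wise diff of nearly 2 can therefore map to ~0 degrees.
    """
    dot = np.abs(np.sum(q_a * q_b, axis=-1)).clip(0.0, 1.0)
    return np.degrees(2.0 * np.arccos(dot))

q = np.array([1.0, 0.0, 0.0, 0.0])
print(quat_angular_error_deg(q, -q))  # 0.0 despite a max component diff of 2
```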
Reproducing the Conversion

To reproduce the conversion from PyTorch to Core ML, follow these steps:

git clone https://github.com/apple/ml-sharp.git
cd ml-sharp
conda create -n sharp python=3.13
conda activate sharp
pip install -r requirements.txt
pip install coremltools
cd ../
python convert.py
Citation

If you find this work useful, please cite the original paper:

@article{Sharp2025:arxiv,
  title      = {Sharp Monocular View Synthesis in Less Than a Second},
  author     = {Lars Mescheder and Wei Dong and Shiwei Li and Xuyang Bai and Marcel Santos and Peiyun Hu and Bruno Lecouat and Mingmin Zhen and Ama\"{e}l Delaunoy and Tian Fang and Yanghai Tsin and Stephan R. Richter and Vladlen Koltun},
  journal    = {arXiv preprint arXiv:2512.10685},
  year       = {2025},
  url        = {https://arxiv.org/abs/2512.10685},
}


License

Apple AMLR: https://choosealicense.com/licenses/apple-amlr
