Sharp Monocular View Synthesis in Less Than a Second (Core ML Edition)
This software project is a community contribution and is not affiliated with the original research paper:
Sharp Monocular View Synthesis in Less Than a Second
by Lars Mescheder, Wei Dong, Shiwei Li, Xuyang Bai, Marcel Santos, Peiyun Hu, Bruno Lecouat, Mingmin Zhen, Amaël Delaunoy, Tian Fang, Yanghai Tsin, Stephan Richter and Vladlen Koltun.
We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements.
This release includes a fully validated Core ML (.mlpackage) version of SHARP, optimized for CPU, GPU, and Neural Engine inference on macOS and iOS.
Use the provided sharp.swift inference script to load the model and generate 3D Gaussian splats (PLY) from any image:
# Compile the Swift runner (requires Xcode command-line tools)
swiftc -O -o run_sharp sharp.swift -framework CoreML -framework CoreImage -framework AppKit
# Run inference on an image and decimate the output by 50%
./run_sharp sharp.mlpackage city.png city.ply -d 0.5
Inference on an Apple M4 Max takes ~1.9 seconds.
CLI Features:
Automatic model compilation and caching
Decimation to reduce point cloud size while preserving visual fidelity
Input is expected as a standard RGB image; conversion to [0,1] and CHW format happens inside the model
Usage: run_sharp [OPTIONS] <model> <input_image> <output.ply>

SHARP Model Inference - Generate 3D Gaussian Splats from a single image

Arguments:
  model                      Path to the SHARP Core ML model (.mlpackage, .mlmodel, or .mlmodelc)
  input_image                Path to input image (PNG, JPEG, etc.)
  output.ply                 Path for output PLY file

Options:
  -m, --model PATH           Path to Core ML model
  -i, --input PATH           Path to input image
  -o, --output PATH          Path for output PLY file
  -f, --focal-length FLOAT   Focal length in pixels (default: 1536)
  -d, --decimation FLOAT     Decimation ratio 0.0-1.0 or percentage 1-100 (default: 1.0 = keep all)
                             Example: 0.5 or 50 keeps 50% of Gaussians
  -h, --help                 Show this help message
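The -d option accepts either a ratio or a percentage. The runner's own parsing code is not shown in this README, so the following Python sketch is only one plausible way to normalize such a value. The function names `keep_ratio` and `gaussians_kept` are hypothetical, and the rule that values up to 1.0 are ratios while larger values up to 100 are percentages is an assumption inferred from the help text.

```python
def keep_ratio(value: float) -> float:
    """Map a -d/--decimation argument to the fraction of Gaussians to keep.

    Assumption (inferred from the help text, not taken from sharp.swift):
    values in (0, 1] are ratios and values in (1, 100] are percentages.
    A value of 0 is rejected here because it would keep nothing.
    """
    if 0.0 < value <= 1.0:
        return value
    if 1.0 < value <= 100.0:
        return value / 100.0
    raise ValueError(f"decimation must be in (0, 1] or (1, 100], got {value}")


def gaussians_kept(total: int, value: float) -> int:
    """Number of Gaussians remaining after decimation, rounded down."""
    return int(total * keep_ratio(value))


print(keep_ratio(0.5), keep_ratio(50))   # both forms keep 50%: 0.5 0.5
print(gaussians_kept(1_179_648, 0.5))    # 589824
```

Under this reading, a value of exactly 1 means "keep all" rather than "keep 1%", which matches the documented default of 1.0.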
Model Input and Output
📥 Input
The Core ML model accepts two inputs:
image: A 3-channel RGB image in uint8 format with shape (1, 3, H, W). Values are expected in range [0, 255] (no manual normalization required). Recommended resolution: 1536×1536 (matches training size). Aspect ratio is preserved; input will be resized internally if needed.
disparity_factor: A scalar tensor of shape (1,) representing the ratio focal_length / image_width. Use 1.0 for standard cameras (e.g., typical smartphone or DSLR). Adjust slightly to control depth scale: higher values = closer objects, lower values = farther scenes. If using the sharp.swift runner, this input is automatically computed from your image dimensions.
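When calling the model without the runner, the disparity_factor input has to be computed by hand. The Python sketch below applies the ratio focal_length / image_width stated above. The function name is hypothetical, and whether the runner uses the original or the internally resized image width is not documented here, so treat the choice of width as an assumption.

```python
def disparity_factor(focal_length_px: float, image_width_px: int) -> float:
    """Ratio focal_length / image_width, as described for the model input."""
    if image_width_px <= 0:
        raise ValueError("image width must be positive")
    return focal_length_px / image_width_px


# CLI default focal length (1536 px) with a 1536-pixel-wide image.
print(disparity_factor(1536, 1536))  # 1.0

# Same focal length in pixels, but an image twice as wide.
print(disparity_factor(1536, 3072))  # 0.5
```

The first case shows that the default focal length of 1536 pixels combined with the recommended 1536-pixel width gives the value 1.0 suggested for standard cameras.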
📤 Output
The model outputs five tensors representing a 3D Gaussian splat representation:
| Output | Shape | Description |
| --- | --- | --- |
| mean_vectors_3d_positions | (1, N, 3) | 3D positions in Normalized Device Coordinates (NDC): x, y, z. |
| singular_values_scales | (1, N, 3) | Scale parameters along each principal axis (width, height, depth). |
| quaternions_rotations | (1, N, 4) | Unit quaternions [w, x, y, z] encoding orientation of each Gaussian. |
| colors_rgb_linear | (1, N, 3) | Linear RGB color values in range [0, 1] (no gamma correction). |
| opacities_alpha_channel | (1, N) | Opacity (alpha) values per Gaussian, in range [0, 1]. |
The total number of Gaussians N is approximately 1,179,648 for the default model.
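To get a feel for what this Gaussian count means in practice, the raw size of the five output tensors can be estimated from the shapes listed above. This is a rough Python sketch: the helper name is hypothetical, and the 4 bytes per value assumes float32 storage, which this README does not state.

```python
# Values stored per Gaussian, taken from the output shapes listed above:
# position (3) + scale (3) + rotation quaternion (4) + color (3) + opacity (1).
VALUES_PER_GAUSSIAN = 3 + 3 + 4 + 3 + 1


def splat_size_bytes(num_gaussians: int, bytes_per_value: int = 4) -> int:
    """Raw size of all five output tensors, ignoring any file header.

    bytes_per_value=4 assumes float32 storage (an assumption, not a
    documented property of the model or of the PLY writer).
    """
    return num_gaussians * VALUES_PER_GAUSSIAN * bytes_per_value


full = splat_size_bytes(1_179_648)
half = splat_size_bytes(1_179_648 // 2)
print(f"{full / 1e6:.1f} MB full, {half / 1e6:.1f} MB at 50% decimation")
# 66.1 MB full, 33.0 MB at 50% decimation
```

The actual PLY file written by the runner may differ in size, depending on which attributes it stores and how it encodes them.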
The Core ML model has been rigorously validated against the original PyTorch implementation. Below are the numerical accuracy metrics across all 5 output tensors:
| Output | Max Diff | Mean Diff | P99 Diff | Angular Diff (°) | Status |
| --- | --- | --- | --- | --- | --- |
| Mean Vectors (3D Positions) | 0.000794 | 0.000049 | 0.000094 | - | ✅ PASS |
| Singular Values (Scales) | 0.000035 | 0.000000 | 0.000002 | - | ✅ PASS |
| Quaternions (Rotations) | 1.425558 | 0.000024 | 0.000067 | 9.2519 / 0.0019 / 0.0396 | ✅ PASS |
| Colors (RGB Linear) | 0.001440 | 0.000005 | 0.000055 | - | ✅ PASS |
| Opacities (Alpha) | 0.004183 | 0.000005 | 0.000114 | - | ✅ PASS |
Validation Notes:
All outputs match PyTorch within 0.01% mean error.
Quaternion angular errors are below 1° for 99% of Gaussians.
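The validation script itself is not included in this README. As a sketch of how an angular difference between two unit quaternions in [w, x, y, z] order can be computed, the Python function below uses the standard formula 2·acos(|q1·q2|). The function name is hypothetical. Because q and -q encode the same rotation, a sign flip produces a large component-wise difference but a zero angular difference, which may explain why the quaternion Max Diff is large while the angular errors stay small (a possible explanation, not one confirmed by the source).

```python
import math


def quaternion_angle_deg(q1, q2):
    """Rotation angle in degrees between two unit quaternions [w, x, y, z].

    Taking the absolute value of the dot product makes q and -q compare
    as equal, since both encode the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


identity = [1.0, 0.0, 0.0, 0.0]
# 90-degree rotation about the y axis.
q = [math.cos(math.pi / 4), 0.0, math.sin(math.pi / 4), 0.0]
neg_q = [-c for c in q]

# Component-wise, q and -q differ by about 1.414, yet the rotation is the same.
print(max(abs(a - b) for a, b in zip(q, neg_q)))
print(quaternion_angle_deg(q, neg_q))      # approximately 0 degrees
print(quaternion_angle_deg(identity, q))   # approximately 90 degrees
```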
Reproducing the Conversion
To reproduce the conversion from PyTorch to Core ML, follow these steps:
If you find this work useful, please cite the original paper:
@inproceedings{Sharp2025:arxiv,
  title = {Sharp Monocular View Synthesis in Less Than a Second},
  author = {Lars Mescheder and Wei Dong and Shiwei Li and Xuyang Bai and Marcel Santos and Peiyun Hu and Bruno Lecouat and Mingmin Zhen and Ama\"{e}l Delaunoy and Tian Fang and Yanghai Tsin and Stephan R. Richter and Vladlen Koltun},
  journal = {arXiv preprint arXiv:2512.10685},
  year = {2025},
  url = {https://arxiv.org/abs/2512.10685},
}