MallardFlow
Generate mallards.
A compact flow-matching mallard image model running behind SageMaker async inference. Current saved model: version 1.0.0.
50 calls/hour
-
Architecture
MallardFlow 1.0.0
The release is a small research-grade flow-matching stack, not a single monolithic checkpoint. The generator works in a learned latent space and uses a curated bank of high-quality mallard examples to anchor global structure.
-
01
Input
Curated mallard crops
Training data comes from verified mallard photos and high-quality segmented web crops. Hard negatives were used to build the detector and reward filters, but the released generator is trained to model positive, mallard-like examples.
-
02
Encoder
128 px Mallard VAE
Images are encoded into a 24 x 32 x 32 latent tensor. Training in this space keeps the model small enough for local Apple Silicon experiments while preserving more structure than earlier attempts that used much smaller latents.
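To make the latent size concrete, here is a small arithmetic sketch. It assumes the "24 x 32 x 32" shape is channels-first (24 channels over a 32 x 32 grid) and that the input is a 3-channel 128 x 128 image; neither layout detail is stated by the release, and `compression_summary` is a hypothetical helper name.

```python
from math import prod

# Shapes taken from the text. The channels-first (C, H, W) reading of
# "24 x 32 x 32" and the 3-channel RGB input are assumptions.
IMAGE_SHAPE = (3, 128, 128)
LATENT_SHAPE = (24, 32, 32)


def compression_summary(image_shape, latent_shape):
    """Return (spatial downsampling factor, ratio of value counts)."""
    spatial_factor = image_shape[1] // latent_shape[1]
    ratio = prod(image_shape) / prod(latent_shape)
    return spatial_factor, ratio


spatial_factor, ratio = compression_summary(IMAGE_SHAPE, LATENT_SHAPE)
print(spatial_factor, ratio)  # prints: 4 2.0
```

Under these assumptions the VAE downsamples each spatial side by 4 and halves the total number of values, so most of the saving comes from the smaller grid rather than from aggressive overall compression.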
-
03
Prior
LatentUNetFlowV5
A reward-weighted LatentUNetFlowV5 prior learns a continuous velocity field from noise toward plausible mallard latents. We sample it with Heun integration and a cosine time schedule.
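As a rough illustration of Heun integration over a cosine time grid, the sketch below integrates a velocity field from t = 0 to t = 1 in plain Python. The exact form of the cosine schedule, the step count, and the function names `cosine_times`, `heun_sample`, and `toy_velocity` are assumptions for illustration; the toy field stands in for the learned LatentUNetFlowV5 network.

```python
import math


def cosine_times(num_steps):
    """Time grid on [0, 1] with smaller steps near both ends (assumed form)."""
    return [0.5 * (1.0 - math.cos(math.pi * i / num_steps))
            for i in range(num_steps + 1)]


def heun_sample(velocity, x0, num_steps=32):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Heun's method."""
    x = list(x0)
    times = cosine_times(num_steps)
    for t0, t1 in zip(times, times[1:]):
        dt = t1 - t0
        # Predictor: plain Euler step using the slope at the start.
        v0 = velocity(x, t0)
        x_pred = [xi + dt * vi for xi, vi in zip(x, v0)]
        # Corrector: average the start slope and the slope at the prediction.
        v1 = velocity(x_pred, t1)
        x = [xi + 0.5 * dt * (a + b) for xi, a, b in zip(x, v0, v1)]
    return x


def toy_velocity(x, t):
    """Stand-in for the learned field: dx/dt = -x, so x(1) = x(0) / e."""
    return [-xi for xi in x]


sample = heun_sample(toy_velocity, [1.0, -2.0])
```

Heun costs two velocity evaluations per step but is second-order accurate, which is why it is often preferred over plain Euler when the network call is the dominant cost and the step budget is small.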
-
04
Structure
Top-256 condition bank projection
The prior output is projected toward nearby entries in a top-256 mallard condition bank. This is the main global-structure aid: it reduces drifting shapes and keeps poses closer to real examples.
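One plausible reading of the projection step is a soft nearest-neighbour pull toward the bank, blended with the original prior output. The sketch below follows that reading on flattened latents. The defaults mirror the published sampling values (nearest-k 4, temperature 0.15, prior blend 0.30), but how the release actually uses those three parameters is an assumption here, as is the helper name `project_to_bank`.

```python
import math


def project_to_bank(prior_latent, bank, nearest_k=4,
                    temperature=0.15, prior_blend=0.30):
    """Pull a flattened latent toward its nearest bank entries (assumed scheme)."""
    # Squared Euclidean distance from the prior output to every bank entry.
    dists = [sum((p - b) ** 2 for p, b in zip(prior_latent, entry))
             for entry in bank]
    nearest = sorted(range(len(bank)), key=lambda i: dists[i])[:nearest_k]

    # Softmax over negative distances; lower temperature favours the closest
    # entry. Subtracting the smallest distance keeps exp() well behaved.
    d_min = dists[nearest[0]]
    weights = [math.exp(-(dists[i] - d_min) / temperature) for i in nearest]
    total = sum(weights)
    projected = [
        sum(w * bank[i][j] for w, i in zip(weights, nearest)) / total
        for j in range(len(prior_latent))
    ]

    # Keep a fraction of the prior output so samples are not pure bank copies.
    return [prior_blend * p + (1.0 - prior_blend) * q
            for p, q in zip(prior_latent, projected)]


bank = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
blended = project_to_bank([0.1, 0.1], bank, nearest_k=2)
```

In this sketch a prior blend of 1.0 disables the projection entirely, while 0.0 with nearest-k 1 snaps the sample onto a single bank entry; values in between trade sample diversity against structural stability.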
-
05
Refiner
LatentUNetFlowV6
A LatentUNetFlowV6 refiner maps the projected condition into the final high-resolution latent. This stage is where feet, body outline, and water/feather transitions improved most.
-
06
Decoder
VAE RGB decode
The final latent tensor is decoded back to a 128 px RGB mallard image grid. The released sampler encodes this grid as a PNG and returns it as base64 in the inference response.
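On the client side, recovering the image from a base64 field might look like the sketch below. The JSON field name `image_base64` and the helper `extract_png` are hypothetical, since the response schema is not documented here; only the standard PNG signature check is a fixed fact.

```python
import base64
import json

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_png(response_body, field="image_base64"):
    """Decode a base64 PNG from a JSON response (field name is assumed)."""
    payload = json.loads(response_body)
    png_bytes = base64.b64decode(payload[field])
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("decoded payload does not look like a PNG")
    return png_bytes
```

The returned bytes can be written straight to a `.png` file or handed to an image library; checking the signature first gives a clearer error than a decoder failure further down the line.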
-
07
Serving
SageMaker async wrapper
The web app writes requests to S3, invokes SageMaker async inference, then polls for a JSON result containing a generated PNG grid. The endpoint is normally kept offline to avoid idle GPU cost.
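The request flow described above can be sketched roughly as follows. The function expects an S3 client and a SageMaker runtime client created elsewhere (for example with `boto3.client("s3")` and `boto3.client("sagemaker-runtime")`); the bucket, key, endpoint name, request fields, and the helper name `submit_and_wait` are all placeholders, and the web app's real code may differ.

```python
import json
import time
from urllib.parse import urlparse


def submit_and_wait(s3, runtime, endpoint_name, bucket, key, request,
                    timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Upload a request, start async inference, and poll S3 for the result."""
    # 1. Write the request payload to S3.
    s3.put_object(Bucket=bucket, Key=key,
                  Body=json.dumps(request).encode("utf-8"))

    # 2. Ask SageMaker to process it; the response says where output will land.
    response = runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=f"s3://{bucket}/{key}",
        ContentType="application/json",
    )
    out = urlparse(response["OutputLocation"])
    out_bucket, out_key = out.netloc, out.path.lstrip("/")

    # 3. Poll until the output object exists or the deadline passes.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            obj = s3.get_object(Bucket=out_bucket, Key=out_key)
        except s3.exceptions.NoSuchKey:
            sleep(interval_s)
            continue
        return json.loads(obj["Body"].read())
    raise TimeoutError(f"no result at {response['OutputLocation']}")
```

A generous timeout matters here: because the endpoint is normally kept offline, the first request after a cold start may wait for an instance to come up. A fuller version would probably also watch the failure location that async inference can report, so errors surface before the timeout.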
- Version: 1.0.0
- Release tag: v1.0.0
- Default sampling: nearest-k 4, temperature 0.15, prior blend 0.30
- Artifact: mallard-flow-v1.0.0-model.tar.gz