Building on HF

Tyler Williams PRO

unmodeled-tyler

AI & ML interests

Founder of Quanta Intellect/VANTA Research - Looking to get in touch? Head to my website!

Recent Activity

reacted to alvdansen's post with 🔥 13 minutes ago
Just open-sourced LoRA Gym with Timothy: a production-ready training pipeline for character, motion, aesthetic, and style LoRAs on Wan 2.1/2.2, built on musubi-tuner. It ships 16 training templates across Modal (serverless) and RunPod (bare metal), covering T2V, I2V, Lightning-merged, and vanilla variants.

Our current experimentation focus is Wan 2.2, which is why we built on musubi-tuner (kohya-ss). Wan 2.2's DiT uses a Mixture-of-Experts architecture with two separate experts gated by a hard timestep switch: you train two LoRAs per concept, one for high noise (composition/motion) and one for low noise (texture/identity), and load both at inference (see the sketch below). Musubi handles this dual-expert training natively, and our templates build on top of it to manage the correct timestep boundaries, precision settings, and flow shift values so you don't have to debug those yourself.

We've also documented fixes for undocumented bugs in musubi-tuner, and validated hyperparameter defaults by cross-referencing multiple practitioners' results rather than relying on untested community defaults.

We're also releasing our auto-captioning toolkit for the first time: per-LoRA-type captioning strategies for characters, styles, motion, and objects (sketched below), with Gemini (free) or Replicate backends. Current hyperparameters reflect consolidated community findings; we've started our own refinement and plan to release specific recommendations and methodology as soon as next week.

Repo: github.com/alvdansen/lora-gym
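To make the dual-expert setup concrete, here's a minimal, hypothetical sketch of the hard timestep switch described above. The boundary value, expert names, and helper functions are illustrative assumptions, not Wan 2.2's or musubi-tuner's actual API:

```python
# Illustrative sketch of Wan 2.2's dual-expert gating idea (not the real API).
# A hard timestep switch decides which expert's LoRA is active at each
# denoising step; both LoRAs are loaded, but only one fires per step.

HIGH_NOISE_BOUNDARY = 0.875  # assumed switch point on a normalized [0, 1] timestep axis

def select_expert(t: float) -> str:
    """Route a normalized timestep to the expert trained for that regime."""
    if t >= HIGH_NOISE_BOUNDARY:
        return "high_noise"   # early, noisy steps: composition and motion
    return "low_noise"        # late, cleaner steps: texture and identity

def active_lora(t: float, loras: dict) -> str:
    """Pick which of the two loaded per-concept LoRAs applies at step t."""
    return loras[select_expert(t)]

# Toy usage: one concept, two LoRAs, gated per step.
loras = {
    "high_noise": "my_concept.high_noise.safetensors",
    "low_noise": "my_concept.low_noise.safetensors",
}
for t in (0.95, 0.90, 0.50, 0.05):
    print(f"t={t:.2f} -> {active_lora(t, loras)}")
```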
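And a hedged sketch of what per-LoRA-type captioning might look like. The prompt templates, function name, and trigger-word convention below are invented for illustration and are not the toolkit's actual interface; only the four LoRA types and the Gemini/Replicate backend options come from the post:

```python
# Illustrative per-LoRA-type captioning dispatch (hypothetical, not the
# toolkit's real interface). Each LoRA type gets a captioning strategy
# tuned to what the LoRA must learn.

CAPTION_TEMPLATES = {
    "character": "Describe the person: identity cues, outfit, expression, pose.",
    "style":     "Describe the visual style: palette, medium, lighting, texture.",
    "motion":    "Describe the motion: camera movement, subject action, pacing.",
    "object":    "Describe the object: shape, material, distinguishing details.",
}

def build_caption_prompt(lora_type: str, trigger_word: str) -> str:
    """Compose the instruction sent to a VLM backend (e.g. Gemini or a
    Replicate-hosted model, per the post) for one training clip or image."""
    template = CAPTION_TEMPLATES[lora_type]
    return f"{template} Begin the caption with the token '{trigger_word}'."

print(build_caption_prompt("motion", "wan_dolly_shot"))
```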
reacted to AbstractPhil's post with 👀 about 17 hours ago
The Rosetta Stone geometric vocabulary, and ramping up its capacity. What makes this particular invariant special is that it shows up in every structure I've tested so far. I had Claude write up the article based on what we built together, and I've since tested it on many substructures.

The current version is flawed, and I have a series of answers for making it more accurate. First, a reconstruction from the ground up: each shape is built upward from the substructure to the point of inductive deviance. This will be slower at first, then gain speed as I optimize, just like the last system did.

The "saddle" problem: the system detected saddles because there wasn't enough deviance in the shapes to attenuate toward higher cardinality and better-aligned substructures. The blobs were around 30-40% of the overall patches, which, interpolated into the others, produced a fair approximation. It most definitely did see those shapes in their voxel complexity. This is real. https://claude.ai/public/artifacts/bf1256c7-726d-4943-88ad-d6addb263b8b

You can play with a public Claude artifact dedicated to viewing the current shape spectrum, and from it see exactly why it's flawed: the shapes are flawed and repetitive. I rapid-prototyped, so there are multiple redundant shapes that simply don't classify well, or at all. And the rotation often doesn't help, or doesn't exist for many shapes. This will be rectified in the next variation.

Next: projecting to a shared latent space as a catalyst, allowing growing subjective geoflow-matched step variance rather than simple direct classification. This should, in theory, let full channel-to-channel invariant features be mapped from structure to structure, with the very formula that encapsulates them baked directly into the math rather than recovered through substructure analysis.

There are many challenges between here and there, so stay tuned, friends, as I plot the geometric language of pretrained AI.

Organizations

Blog-explorers · VANTA Research