Skill UI

Browse and discover 10574+ curated skills

Search "RLHF", found 3 results

TRL RLHF Pipeline

fine-tuning-with-trl

Orchestra-Research/AI-Research-SKILLs

Fine-tune LLMs with TRL's RLHF toolkit, covering SFT, DPO, PPO/GRPO, and reward model training, so you can align Hugging Face Transformers models with human preferences or feedback in post-training workflows.

GRPO RL Fine-Tuning

grpo-rl-training

Orchestra-Research/AI-Research-SKILLs

Provides expert guidance for GRPO training with TRL, covering dataset prep, reward engineering, structured outputs, and multi-objective alignment when preference pairs are scarce.

OpenRLHF Training Suite

openrlhf-training

Orchestra-Research/AI-Research-SKILLs

High-performance RLHF framework that combines Ray, vLLM, and ZeRO-3 to accelerate PPO/GRPO/RLOO/DPO training for 7B-70B+ models on distributed GPU clusters, providing a single-stack workflow for reward models and policy optimization.
