Skill UI
Browse and discover 10574+ curated skills
Search "RLHF": found 3 results
TRL RLHF Pipeline
fine-tuning-with-trl
Orchestra-Research/AI-Research-SKILLs
486
Fine-tune LLMs with TRL’s RLHF toolkit, covering SFT, DPO, PPO/GRPO and reward model training so you can align HuggingFace Transformers with human preferences or feedback in post-training workflows.
GRPO RL Fine-Tuning
grpo-rl-training
Orchestra-Research/AI-Research-SKILLs
416
Provides expert guidance for GRPO training with TRL, covering dataset prep, reward engineering, structured outputs, and multi-objective alignment when preference pairs are scarce.
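As a rough illustration of the reward engineering and multi-objective alignment this skill mentions, the sketch below combines several reward functions into one weighted score and then normalizes the scores within a group of completions sampled for the same prompt, which is the group-relative idea behind GRPO. It is plain Python and does not use TRL; the function names, the example reward functions, the weights, and the choice of population standard deviation are all assumptions made for illustration, not the skill's or the library's actual code.

```python
from statistics import mean, pstdev


def format_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the completion is wrapped in <answer> tags."""
    text = completion.strip()
    return 1.0 if text.startswith("<answer>") and text.endswith("</answer>") else 0.0


def brevity_reward(completion: str, max_chars: int = 40) -> float:
    """Hypothetical reward: 1.0 if the completion is at most max_chars long."""
    return 1.0 if len(completion) <= max_chars else 0.0


def combined_reward(completion, reward_fns, weights):
    """Weighted sum of several reward functions (a simple multi-objective score)."""
    return sum(w * fn(completion) for fn, w in zip(reward_fns, weights))


def group_relative_advantages(rewards, eps=1e-8):
    """Center and scale rewards within one group of completions for a single prompt.

    Assumption: population standard deviation; real implementations may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, a group of four sampled completions (made-up data).
group = [
    "<answer>42</answer>",
    "The answer is probably 42, but let me explain at great length why.",
    "<answer>41</answer>",
    "42",
]
rewards = [
    combined_reward(c, [format_reward, brevity_reward], [0.7, 0.3]) for c in group
]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive positive advantages and those below receive negative ones, so no preference pairs or separately trained reward model are needed for this step.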
OpenRLHF Training Suite
openrlhf-training
Orchestra-Research/AI-Research-SKILLs
474
High-performance RLHF framework that combines Ray, vLLM, and ZeRO-3 to accelerate PPO/GRPO/RLOO/DPO training for 7B-70B+ models on distributed GPU clusters, providing a single-stack workflow for reward models and policy optimization.