Skill UI
Browse and discover 5019+ curated skills
Search: RLHF, found 3 results
TRL Reinforcement Fine-Tuning
fine-tuning-with-trl
Orchestra-Research/AI-Research-SKILLs
240
Provides TRL-based RLHF fine-tuning flows covering SFT, reward-model training, PPO, DPO, and GRPO so teams can align HuggingFace models with preferences using both pipeline scripts and CLI helpers.
OpenRLHF High Performance Training
openrlhf-training
Orchestra-Research/AI-Research-SKILLs
344
OpenRLHF is a Ray-based RLHF framework that accelerates large-model PPO, GRPO, RLOO, and DPO training with vLLM inference and ZeRO-3 resource sharing, targeting distributed GPU clusters handling 7B-70B+ parameter models.
Verl RL Training
verl-rl-training
Orchestra-Research/AI-Research-SKILLs
486
Guides RLHF-style LLM fine-tuning with verl's HybridFlow stack, scaling PPO/GRPO/DAPO across FSDP/Megatron/vLLM backends for math reasoning, agentic multi-turn, and vision-language workflows.