Rails World is now sold out - See you soon in Amsterdam!

LLM Evaluations & Reinforcement Learning for Shopify Sidekick on Rails

This talk explores building production LLM systems through Shopify Sidekick’s Rails architecture, covering orchestration patterns and tool integration strategies. We’ll establish statistically rigorous LLM-based evaluation frameworks that move beyond subjective “vibe testing.” Finally, we’ll demonstrate how robust evaluation systems become critical infrastructure for reinforcement learning pipelines, and examine how RL can learn to hack evaluations and which strategies mitigate that.

Friday September 5 / 15:45 - 16:15 / Track 1

Andrew Mcnamara

Director Applied ML, Shopify

Charlie Lee

Staff Engineer, Shopify