Rails World is now sold out - See you soon in Amsterdam!
LLM Evaluations & Reinforcement Learning for Shopify Sidekick on Rails
This talk explores building production LLM systems through Shopify Sidekick’s Rails architecture, covering orchestration patterns and tool integration strategies. We’ll establish statistically rigorous LLM-based evaluation frameworks that move beyond subjective “vibe testing.” Finally, we’ll demonstrate how robust evaluation systems become critical infrastructure for reinforcement learning pipelines, and we’ll examine how RL can learn to hack evaluations, along with strategies to mitigate this.
Friday September 5 / 15:45 - 16:15 / Track 1

Andrew Mcnamara
Director Applied ML, Shopify

Charlie Lee
Staff Engineer, Shopify