Humana LLM Proof of Technology
Challenge
Humana wanted to explore the potential of large language models to support its internal copy department. Before committing to a pilot program, the organization needed an unbiased, side-by-side evaluation of LLM technologies to assess suitability for real-world copywriting tasks.
Solution
Humana engaged Deloitte to design and run a structured evaluation process. The goal was to compare three LLM platforms (ChatGPT-4o, Microsoft Copilot, and Jasper) against defined criteria relevant to Humana’s copywriting needs. At the time, Humana team members were largely unfamiliar with chatbots and needed end-to-end guidance through the capabilities of each technology.
My role
- Helped select the LLM technologies to be evaluated.
- Designed a cross-platform comparison rubric.
- Drafted prompts to be tested across all platforms.
- Facilitated sessions with Humana copywriters, guiding them through the process of rating outputs from each model.
- Retailored prompts and input documents to add benchmarks addressing concerns the Hive team raised mid-project.
Outcome
The structured evaluation provided Humana with a clear, comparative view of the strengths and limitations of each LLM platform. This analysis became the foundation for Humana’s decisions about a focused pilot pitch.


