https://twitter.com/blu3mo/status/1712277274387452112

I’m curious about what would happen if we use a cheaper LLM (like gpt-3.5) for 90% of the output, and only use a superior LLM (like gpt-4) for the remaining 10% of the important parts. For example, using gpt-4 only for the first 3 characters of each sentence.