Fractal Summary: Implementation of Llama 3 — from FractalReader Operation Diary ([/villagepump/フラクタル要約: Llama 3 Implementation](https://scrapbox.io/villagepump/フラクタル要約: Llama 3 Implementation))

  • LLM Pricing
  • It was super easy to implement using together.ai (blu3mo)(blu3mo)
    • Since together.ai is compatible with the OpenAI API, Llama 3 70B runs just by switching the endpoint to together.ai and changing the model name
  • English version: https://fractal-reader.com/view/d6db0839-81d1-4bd9-bfbb-1250fe163c88
  • Japanese version: https://fractal-reader.com/view/475c8c5a-3096-4830-80cb-25edc33a2ade
    • Hmm, maybe it’s not as bad as expected (blu3mo)(blu3mo)
    • Probably failing at Level 1 due to exceeding the token limit
      • Need to think about how to handle this
  • The price is about a tenth of gpt-4's
  • Well, upon closer inspection, some summary requests are failing
    • In some cases it fails to generate valid JSON
    • gpt-4 is probably more stable in that regard
  • It might be good to have a mechanism that retries the request with gpt-4-turbo when Llama fails
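
The endpoint swap mentioned above can be sketched as a small config change: the request payload keeps the OpenAI chat-completions shape, and only the base URL and model name differ. The URL and the model identifier here are assumptions, so check together.ai's model list before using them.

```python
# Sketch: OpenAI-compatible request building, assuming together.ai's
# base URL and Llama 3 70B model name (both unverified assumptions).
CONFIGS = {
    "gpt-4-turbo": {
        "base_url": "https://api.openai.com/v1",
        "model": "gpt-4-turbo",
    },
    "llama-3-70b": {
        "base_url": "https://api.together.xyz/v1",           # assumed endpoint
        "model": "meta-llama/Llama-3-70b-chat-hf",           # assumed model id
    },
}

def completion_request(name: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion request; the same shape
    works for both providers, only url and model differ."""
    config = CONFIGS[name]
    return {
        "url": config["base_url"] + "/chat/completions",
        "json": {
            "model": config["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Because the payload shape is identical, the official `openai` client can also be pointed at the other provider by passing a different `base_url` when constructing the client.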
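
A minimal sketch of that fallback idea: validate the JSON from Llama, and only re-send to gpt-4-turbo when parsing fails. `call_llama` and `call_gpt4` are hypothetical request functions, not FractalReader's actual code.

```python
import json

def summarize_with_fallback(prompt: str, call_llama, call_gpt4) -> dict:
    # call_llama / call_gpt4 are hypothetical functions that send the
    # prompt to the respective model and return the raw text reply.
    raw = call_llama(prompt)
    try:
        return json.loads(raw)                    # Llama produced valid JSON
    except json.JSONDecodeError:
        return json.loads(call_gpt4(prompt))      # retry with gpt-4-turbo
```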

Trying Qwen 1.5 as well

  • This one has a 30k-token input length, so longer text can be fed in
  • (image: screenshot of Qwen's output)
    • Asked for Japanese but got Chinese instead…
  • Qwen's output quality is high, but it doesn't always follow the prompt's instructions
  • Also, the output is slow
  • There's a reason cheap models are cheap: they're hard to control and the output is noisy
    • Once again realizing the strength of gpt-4-turbo
    • Nevertheless, the quality of open models is impressive
  • Want to Compare Low-Cost Open Models with GPT