• Ideas that came up during a conversation with ritar.
  • On 5/18, while talking with tskokmt, my interest in it was rekindled.
    • The first prototype of “OBJECT,” a musical instrument that plays visuals, created by @taito_hasegawa and me, is equipped with patterns designed by Dutch graphic designer Karel Martens. It pays homage to outstanding visuals of the past by sampling them, letting our generation interact with and play them. #modulepoeticcode pic.twitter.com/SFJ01oRtex

    • @TSKOKMT, [March 24, 2024](https://twitter.com/TSKOKMT/status/1771910114527826423?ref_src=twsrc%5Etfw)
      
      • If there is a “musical instrument that plays visuals,” it would be nice to have a “musical instrument that plays text” as well.

Concrete image at present:

  • A metronome is ticking.

  • Text is displayed on the screen, with one character added per beat.

  • If left alone, the LLM keeps extending the text as it pleases.

  • Press buttons like playing drums to control the output.

    • Pressing the “a” button restricts the next character to the “a” row (a, i, u, e, o).
      • The rows: a, ka, sa, ta, na, ha, ma, ya, ra, wa
    • Pressing the “k” button restricts the next character to the “ka” row (ka, ki, ku, ke, ko).
      • Imagine a sound like “ka.”
  • Using this, press keys rhythmically as text is generated.

    • For example, pressing “a” every four beats generates text that rhymes on the “a” row.
    • Pressing “t,” “k,” “d,” etc., at the right timing allows you to create a beat with consonants.
  • It should be possible to control things like “output two characters per beat at this timing” or “insert a space at this timing.”

    • Necessary for rhythm control.
  • It should also be possible to control the generation of text based on the desired meaning or mood.

    • Adjusting the base output prompt in real-time.
  • Similar to a VJ or DJ, it seems possible to improvise and continuously create poetry and lyrics (blu3mo)(blu3mo)(blu3mo).

  • After reading [this article](https://scrapbox.io/frog96lab/押韻について 非母音主義、押韻の距離感、汎韻) about rhyming, I have a clearer image. It would be fun to try this out with Frocro-san (blu3mo)(blu3mo).

  • Implementation:

    • The slowness of the LLM is an issue.
      • Solution: always have the LLM speculatively generate the upcoming characters.
        • Generate ahead of time so that every possible constraint pattern is covered.
        • With n constraint patterns and a lookahead of k characters, that means n^k extra generations (see the sketch after this list).
        • Accept the waste and use the LLM extravagantly.
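
A minimal sketch of how this could fit together, assuming Python and reducing the LLM to a stub: a metronome loop adds one character per beat, a button press narrows the next character to one kana row, and a speculative cache pre-generates a continuation for every possible press so playback never waits on the model. The kana-row table, button names, and `next_char` stub are illustrative assumptions, not part of the original idea beyond the bullets above.

```python
import itertools
import random
import time

# Kana rows ("a" row, "ka" row, ...) used as button constraints.
KANA_ROWS = {
    "a": "あいうえお",
    "k": "かきくけこ",
    "s": "さしすせそ",
    "t": "たちつてと",
    "n": "なにぬねの",
    "h": "はひふへほ",
    "m": "まみむめも",
    "y": "やゆよ",
    "r": "らりるれろ",
    "w": "わを",
}
ANY = "".join(KANA_ROWS.values())  # no button pressed -> any kana


def next_char(context: str, allowed: str) -> str:
    """Stand-in for the LLM: pick the next character from `allowed`, given the
    text so far. A real version would bias or filter the model's next-token
    distribution down to the allowed characters."""
    return random.choice(allowed)


def speculative_cache(context: str, buttons, k: int) -> dict:
    """Pre-generate a k-character continuation for every possible sequence of
    the next k button presses: n buttons -> n^k entries."""
    cache = {}
    for seq in itertools.product(buttons, repeat=k):
        text = context
        for b in seq:
            text += next_char(text, KANA_ROWS.get(b, ANY))
        cache[seq] = text[len(context):]
    return cache


if __name__ == "__main__":
    bpm = 120
    context = ""
    buttons = ["a", "k", "t", None]  # None = no constraint on that beat
    for beat in range(8):
        # Pre-compute the next character for every possible press
        # (k=1 here; a larger k hides more latency at n^k cost).
        cache = speculative_cache(context, buttons, k=1)
        pressed = random.choice(buttons)  # stands in for a real button press
        context += cache[(pressed,)]
        print(f"beat {beat} [{pressed or '-'}]: {context}")
        time.sleep(60 / bpm)  # the metronome tick
```

With n buttons and a lookahead of k beats the cache holds n^k pre-generated continuations, which is exactly the n^k cost noted above; in practice you would refresh it between beats and throw most of it away.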

Other ideas:

  • Sampling phrases can be exciting.
    • It adds intensity when a familiar phrase suddenly appears.
  • Mixing:
    • Mixing words was something mentioned by [ritar] before.

For a rough prototype (blu3mo):

  • Instead of focusing on creating coherent sentences, how about creating a sequence of random words?
      • Experiment with just the sound to see whether it's interesting (see the sketch below).
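
A throwaway sketch of that prototype, assuming “just the sound” means picking random Japanese words whose first kana falls in the pressed button's row; the word list and row table are placeholders made up for illustration, no LLM involved.

```python
import random
import time

WORDS = ["あさ", "かぜ", "さくら", "たいよう", "なみ", "はる", "まち", "やま", "りずむ", "わらい"]

ROWS = {"a": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
        "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "y": "やゆよ",
        "r": "らりるれろ", "w": "わを"}


def random_word(button):
    """Pick a random word; if a button is pressed, require its first kana
    to be in that button's row."""
    if button is None:
        return random.choice(WORDS)
    pool = [w for w in WORDS if w[0] in ROWS[button]]
    return random.choice(pool) if pool else random.choice(WORDS)


if __name__ == "__main__":
    bpm = 90
    for beat in range(8):
        button = random.choice(["a", "k", "t", None])  # fake button input
        print(random_word(button), end=" ", flush=True)
        time.sleep(60 / bpm)
    print()
```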

References:

  • Report on the Rap AI system being developed in the Mitou project
    • For rhyme-aware rap generation, a reverse-generation method was adopted: a Transformer encoder-decoder generates the text backwards so that the lyrics end on a rhyme. For the verse-generation system that handles answers, training data representing answer correspondences was created, and a custom evaluation function was built by adding a filter.
    • I see.
    • If the rhyme at the end is already decided, this works nicely.
    • I wonder whether a strong LLM in 2024 could do this in an easier way (blu3mo) (see the sketch below).
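
One rough take on that question, purely a sketch and not from the report: with a strong chat model you might skip reverse generation and just rejection-sample lines until the final kana carries the target vowel. `call_llm` is a hypothetical placeholder, not a real API, and the vowel table only covers plain hiragana.

```python
import random

# Map each plain hiragana to its vowel, for a crude end-rhyme check.
VOWEL = {}
for vowel, kana in [("a", "あかさたなはまやらわがざだばぱ"),
                    ("i", "いきしちにひみりぎじぢびぴ"),
                    ("u", "うくすつぬふむゆるぐずづぶぷ"),
                    ("e", "えけせてねへめれげぜでべぺ"),
                    ("o", "おこそとのほもよろをごぞどぼぽ")]:
    for ch in kana:
        VOWEL[ch] = vowel


def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns a canned line so the sketch runs."""
    return random.choice(["ひかりのなか", "こえがひびくよ", "まだみぬあした"])


def line_ending_with_vowel(theme: str, target_vowel: str, tries: int = 20):
    """Ask for short lines on `theme` and keep the first one whose last kana
    has the target vowel (rejection sampling)."""
    for _ in range(tries):
        line = call_llm(f"{theme}について7音くらいの一行を書いて")
        if line and VOWEL.get(line[-1]) == target_vowel:
            return line
    return None


print(line_ending_with_vowel("夜の街", "a"))
```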