    Explanations

    questions after prompts

    Negative Logits
    Token              Value
    "Fear"             0.47
    (token missing)    0.47
    " Gently"          0.45
    "Founded"          0.45
    " Fear"            0.45
    " presente"        0.45
    " Founded"         0.45
    " Gente"           0.43
    " gently"          0.42
    "Introduce"        0.42
    Positive Logits
    Token              Value
    "izzato"           0.48
    "重要な"            0.48
    "<unused2173>"     0.47
    " 중요한"           0.47
    " llrp"            0.47
    "endment"          0.47
    " Donovan"         0.46
    " importantes"     0.46
    "isations"         0.45
    " нужны"           0.45
    Activation Density: 0.001%

    No Known Activations