INDEX
    Explanations

    turn to phrases and prompts

    Negative Logits
     Olimpi          0.84
    accumulation     0.82
     ومن             0.82
    ammer            0.81
    Panoramic        0.80
     منهج            0.79
    土産             0.78
     Découvrez       0.78
    Referências      0.77
    <unused1639>     0.77
    Positive Logits
     prompts         0.80
     prompted        0.72
    ishing           0.68
     prompt          0.68
     cor             0.67
     cl              0.66
    стные            0.64
     oh              0.63
     heads           0.63
    prompt           0.62
    Act Density 0.000%

    No Known Activations