    Negative Logits
    Token          Logit
    "yam"          -0.61
    "s"            -0.59
    "</em>"        -0.55
    "ன்ன"          -0.53
    "cc"           -0.50
    "englisch"     -0.50
    "quiv"         -0.49
    "ç"            -0.49
    "ra"           -0.49
    "tin"          -0.48
    Positive Logits
    Token              Logit
    " IDEA"            1.29
    "Idea"             1.22
    " ideas"           1.22
    "Ideas"            1.21
    " Ideas"           1.20
    " IDEAS"           1.14
    "ideas"            1.14
    " Idea"            1.11
    "IDEA"             0.99
    " inappropriés"    0.92
    Activation Density: 0.055%

    No Known Activations