    Explanations

    instances where the word "doesn't" is included

    negations or words indicating absence

    Negative Logits
    Token       Logit
    "lehem"     -0.77
    "odon"      -0.70
    "igers"     -0.65
    "ende"      -0.65
    "udo"       -0.63
    " chin"     -0.63
    "zel"       -0.63
    "ghazi"     -0.63
    "iger"      -0.63
    "fall"      -0.62
    Positive Logits
    Token       Logit
    "cean"      0.81
    " charact"  0.72
    "iggurat"   0.67
    "atell"     0.65
    " corrid"   0.65
    "ospons"    0.62
    " axis"     0.62
    "hyde"      0.62
    "ajo"       0.62
    "ãĢİ"       0.61
    Act Density 0.000%

    No Known Activations