INDEX
    Explanations

    response and what follows

    Negative Logits
    " "            0.60
    "ha"           0.49
    "cp"           0.46
    " on"          0.45
    " Trabaj"      0.45
    " Are"         0.45
    "coating"      0.43
    " sierpnia"    0.43
    "li"           0.42
    " cla"         0.42
    Positive Logits
    "Roth"             0.49
    "ش"                0.49
    " Roth"            0.48
    "šku"              0.46
    " уйнагыз"         0.46
    " expansión"       0.46
    " izquier"         0.45
    "PopMatrix"        0.45
    (token not shown)  0.45
    " admissibility"   0.45
    Activation Density 0.000%

    No Known Activations