INDEX
    Explanations

    don't, isn't, wasn't, it's

    Negative Logits
    വ്    0.75
    w    0.75
    går    0.75
    wap    0.74
    色が    0.74
    meye    0.70
    ları    0.69
     byla    0.68
    verkehr    0.68
    abilir    0.68
    Positive Logits
     unprecedented    0.90
     reiterate    0.85
     uncomfortable    0.83
     uneasy    0.82
     trembling    0.82
     то    0.80
     Timothy    0.79
     indispensable    0.78
     reiter    0.78
     shaking    0.77
    Act Density 0.118%

    No Known Activations