    NEGATIVE LOGITS
    " capitals"       -0.09
    "ão"              -0.08
    " برخورد"         -0.08
    " दृ"             -0.08
    " हाद"            -0.08
    " İn"             -0.08
    " indes"          -0.08
    "Frag"            -0.07
    " interstate"     -0.07
    " defeated"       -0.07
    POSITIVE LOGITS
    " snail"          0.08
    " salmon"         0.08
    " señ"            0.08
    "-grown"          0.08
    "cie"             0.07
    " rehabil"        0.07
    " نج"             0.07
    " Singapore"      0.07
    " koi"            0.07
                      0.07
    Activation Density: 0.003%

    No Known Activations