INDEX
    Explanations
    NEGATIVE LOGITS
    Token              Logit
    " kwe"             -0.08
    " lect"            -0.08
    "Imag"             -0.07
    (token missing)    -0.07
    (token missing)    -0.07
    "istency"          -0.07
    "istent"           -0.07
    " urine"           -0.07
    " IVA"             -0.07
    " Sas"             -0.07
    POSITIVE LOGITS
    Token              Logit
    "abyte"            0.09
    " olmayan"         0.08
    "شنبه"             0.08
    "unchecked"        0.08
    "classe"           0.07
    " varit"           0.07
    " కాక"             0.07
    "íos"              0.07
    " πρώτο"           0.07
    " dzień"           0.07
    Activation Density: 0.027%

    No Known Activations