    Negative Logits
    " purification"    -0.08
    " importancia"     -0.08
    " про"             -0.08
    "тарының"          -0.08
    " viva"            -0.08
    " التط"            -0.08
    " kissing"         -0.08
    "ط"                -0.07
    "مم"               -0.07
    "ยอด"              -0.07
    Positive Logits
    " overflow"        0.11
    "Overflow"         0.10
    " Overflow"        0.10
    "overflow"         0.09
    " Socialist"       0.08
    "ahas"             0.08
    " silently"        0.08
    " Flags"           0.08
    " Hinweise"        0.07
    " Workbook"        0.07
    Activation Density: 0.002%

    No Known Activations