INDEX
    Explanations
    Negative Logits
     magia        -0.07
     Conflict     -0.07
    koa           -0.07
     Abuse        -0.07
    225           -0.07
    (mult         -0.07
     kif          -0.06
     confusion    -0.06
     cib          -0.06
    Positive Logits
    verzekering   0.08
    žių           0.08
     subsidi      0.08
    -padding      0.08
     ζω           0.07
    زش            0.07
    žio           0.07
    Luong         0.07
    tructor       0.07
     ფას          0.07
    Act Density 0.000%

    No Known Activations