    NEGATIVE LOGITS
    ia      1.24
    ie      1.23
    ii      1.21
    ios     1.20
    oe      1.19
    ien     1.17
    i       1.17
    iene    1.15
    iin     1.13
    ies     1.13
    POSITIVE LOGITS
    ńskiego           0.77
    addEnemy          0.75
    ACIONES           0.74
     hormati          0.71
    (no token shown)  0.70
     kształ           0.69
    (no token shown)  0.68
     généraux         0.68
     déchets          0.68
     گردد             0.68
    Act Density 0.001%

    No Known Activations