INDEX
    Explanations
    Negative Logits
    Token              Logit
    "uli"              -0.06
    "\tIf"             -0.06
    "_pe"              -0.06
    "_setting"         -0.06
    " contemplated"    -0.06
    "рат"              -0.06
    "=>"               -0.06
    "ΟΝ"               -0.06
    "_hidden"          -0.06
    "ncmp"             -0.06

    Positive Logits
    Token              Logit
    " conc"            0.06
    " vin"             0.06
    "132"              0.06
    "yní"              0.06
    " Vin"             0.06
    "ío"               0.06
    " elic"            0.06
    " içeren"          0.06
    " appel"           0.06
    " peter"           0.06
    Activation Density: 0.021%

    No Known Activations