    Negative Logits
    Token           Logit
                    -0.06
    ,obj            -0.06
     проведения     -0.06
    	ans            -0.06
     announced      -0.06
     "}↵            -0.06
     slamming       -0.06
     подраз         -0.06
     Aug            -0.06
     gören          -0.06
    Positive Logits
    Token           Logit
    ipples          0.09
    icken           0.08
    ifa             0.08
    ifo             0.08
    isel            0.08
    itone           0.08
    isson           0.08
    450             0.07
    CCA             0.07
    ivar            0.07
    Activation Density: 0.323%

    No Known Activations