    Negative Logits
    " unlucky"      -0.07
    "_daily"        -0.07
    " Marty"        -0.07
    " strugg"       -0.07
    " blasph"       -0.07
    " angry"        -0.06
    " аллерг"       -0.06
    "\tgl"          -0.06
    " مطال"         -0.06
    " struggling"   -0.06
    Positive Logits
    " means"        0.10
    "means"         0.08
    "ms"            0.08
    "ens"           0.08
    "ων"            0.08
    "/as"           0.08
    " Means"        0.08
    "MS"            0.07
    " obtains"      0.07
    " Von"          0.07
    Activation Density: 0.012%

    No Known Activations