INDEX
Negative Logits
orex            -0.08
하게            -0.07
.communication  -0.06
مار             -0.06
似乎            -0.06
segregation     -0.06
_PRIMARY        -0.06
Setup           -0.06
Path            -0.06
meu             -0.06
Positive Logits
connect         0.07
ườ              0.07
_reader         0.07
spoof           0.07
080             0.07
-building       0.06
separat         0.06
.userid         0.06
cof             0.06
.cover          0.06
Activations Density 0.050%