INDEX
Negative Logits
soak          -0.08
번            -0.08
engineering   -0.08
engineering   -0.07
phys          -0.07
engineers     -0.07
Sang          -0.07
+w            -0.07
lec           -0.07
kines         -0.07
Positive Logits
掉            0.08
קרים          0.08
offending     0.08
případ        0.08
Thorn         0.07
случаев       0.07
rejects       0.07
khỏi          0.07
undesirable   0.07
запрещ        0.07
Activations Density 0.007%