INDEX
Negative Logits
Token            Value
modify           -0.08
ват              -0.06
captivity        -0.06
actually         -0.06
wolf             -0.06
sarcast          -0.06
Sesso            -0.06
bazen            -0.06
共               -0.06
configurations   -0.06
Positive Logits
Token            Value
able             0.08
le               0.07
Veterinary       0.06
aft              0.06
orable           0.06
Lbl              0.06
Simple           0.06
_LSB             0.06
méd              0.06
dl               0.06
Activations Density 0.006%