INDEX
NEGATIVE LOGITS
ersive    -0.70
contam    -0.64
surpr     -0.63
OGR       -0.61
WER       -0.61
osate     -0.59
ertodd    -0.59
condem    -0.58
derog     -0.58
urer      -0.57
POSITIVE LOGITS
successive      0.68
individually    0.65
sorts           0.65
respective      0.58
phases          0.57
frames          0.57
Georgian        0.56
course          0.56
ses             0.55
Realms          0.55
Activations Density: 9.394%