INDEX
NEGATIVE LOGITS
ist        0.47
process    0.43
upgrade    0.42
uncertain  0.42
yg         0.41
da         0.40
state      0.40
arnell     0.40
inas       0.40
iled       0.39
POSITIVE LOGITS
than        0.63
niż         0.63
paradigms   0.57
altogether  0.55
Than        0.53
전혀        0.53
種類の      0.50
ніж         0.50
manière     0.49
berbeda     0.49
Activations Density: 0.103%