Negative Logits
Token         Logit
U             0.90
Audience      0.86
Appl          0.84
Å             0.84
Nutzer        0.83
Área          0.83
jed           0.80
Adaptive      0.78
N             0.78
Enseñanza     0.78
Positive Logits
Token         Logit
conquered     0.72
fabricants    0.68
犸            0.68
ᱼ             0.66
carène        0.65
ังหว          0.64
soared        0.64
tormented     0.64
राना          0.63
斯拉          0.62
Activations Density 0.533%