INDEX
Negative Logits
ideologies    0.98
préparer      0.96
philosophies  0.95
философ       0.93
допомогти     0.93
programas     0.93
özel          0.93
他人          0.92
habilidades   0.91
cadeau        0.90
Positive Logits
maximum       0.92
Observed      0.90
approximated  0.88
spurious      0.87
Maximum       0.87
Maximum       0.85
detectable    0.85
detected      0.85
approximate   0.82
maksimum      0.82
Activations Density 0.176%