INDEX
Explanations
titles and roles associated with professionals and public figures
Negative Logits
itled         -0.17
Configurer    -0.16
Able          -0.15
ssf           -0.15
hiba          -0.15
issen         -0.14
stricted      -0.14
kker          -0.14
imest         -0.14
phia          -0.14
Positive Logits
best          0.27
widely        0.25
often         0.23
best          0.23
based         0.21
frequently    0.21
better        0.20
now           0.20
with          0.20
long          0.20
Activations Density: 0.121%