INDEX
Explanations
proper names or terms beginning with 'Na'
the mention of the name "Na" or variations of it
Negative Logits
eering      -0.85
tips        -0.84
dress       -0.79
Journals    -0.77
ttes        -0.75
papers      -0.69
geist       -0.69
lords       -0.69
*/(         -0.68
hetti       -0.68
Positive Logits
elson       1.00
uthor       0.95
ïve         0.95
iber        0.93
omi         0.91
Äį          0.89
Na          0.89
vel         0.89
Ni          0.88
emonic      0.88
Activations Density 0.007%