Explanations
specific names, likely proper nouns
names of individuals and characters associated with significant events or contexts
Negative Logits
Token        Logit
journal      -0.85
nerv         -0.69
saline       -0.68
arenthood    -0.68
governors    -0.67
igators      -0.65
izoph        -0.65
oxy          -0.63
captcha      -0.63
ggy          -0.62
Positive Logits
Token        Logit
Curtis       0.80
Compton      0.78
hift         0.74
Yar          0.74
Dixon        0.72
Armstrong    0.69
McKenzie     0.68
aba          0.68
ensor        0.67
ゼウス       0.67
Activations Density: 0.010%