INDEX
Explanations
names of individuals, particularly with a focus on "Rin" and "Nico"
proper nouns, particularly names related to individuals or entities
Negative Logits
Token       Logit
berra       -0.85
acular      -0.68
een         -0.68
mort        -0.68
ritional    -0.63
eneg        -0.63
ers         -0.63
yer         -0.62
rait        -0.62
stroke      -0.62
Positive Logits
Token       Logit
··          0.83
itarian     0.83
Rin         0.81
Ig          0.81
za          0.69
ás          0.68
Roosevelt   0.68
zl          0.67
ASED        0.67
zman        0.66
Activations Density: 0.046%