INDEX
Explanations
names of specific individuals
proper nouns, particularly names of individuals or places
Negative Logits
ively      -0.93
ion        -0.93
iveness    -0.91
aired      -0.89
elf        -0.84
ivity      -0.84
ional      -0.82
lie        -0.80
ework      -0.78
vest       -0.78
Positive Logits
hyde        0.84
Hasan       0.72
Penal       0.71
uay         0.71
Canaver     0.69
Cabrera     0.66
cember      0.66
Bane        0.66
Williamson  0.65
Wast        0.64
Activations Density 0.012%