INDEX
Explanations
names or words related to individuals such as "David" and "Hillary Clinton"
proper nouns, particularly names and identifiers related to individuals or entities
Negative Logits
thora     -0.86
exha      -0.77
Rita      -0.75
vati      -0.73
resil     -0.73
pmwiki    -0.72
psc       -0.71
murd      -0.64
¹         -0.64
asma      -0.64
Positive Logits
vironment  0.95
ovo        0.90
ception    0.85
heimer     0.83
iden       0.81
ciating    0.81
ly         0.80
sson       0.79
ally       0.77
structed   0.76
Activations Density 0.041%