INDEX
Explanations
proper nouns likely related to locations or people
names of individuals, particularly focusing on their frequent mentions in the text
Negative Logits
acular      -0.78
rall        -0.70
ition       -0.68
pmwiki      -0.65
orative     -0.63
llular      -0.63
uality      -0.62
downhill    -0.61
âĶĢâĶĢ      -0.61
ISION       -0.61
Positive Logits
idays       0.81
aida        0.78
wig         0.76
riks        0.75
mann        0.74
sworth      0.74
mong        0.73
aday        0.71
felt        0.70
ancock      0.70
Activations Density 0.157%