INDEX
Explanations
references to people, specifically the name "Smith" and related proper nouns
Negative Logits
Token     Logit
iper      -0.15
fen       -0.15
ÑĨÑĸй     -0.15
suất      -0.15
raud      -0.15
rious     -0.14
inqu      -0.14
aways     -0.14
IP        -0.14
acak      -0.14
Positive Logits
Token     Logit
sonian    0.45
wick      0.40
ers       0.31
ies       0.30
son       0.30
ere       0.24
SON       0.21
ells      0.21
urst      0.21
aller     0.20
Activations Density 0.011%
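
To give the density figure a sense of scale, the sketch below converts it into "one activation per N tokens". It assumes that "Activations Density" means the percentage of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function names are hypothetical and used only for illustration.

```python
def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens between activations.

    Assumption: density_percent is the share of tokens (in percent)
    on which the feature fires at all.
    """
    fraction = density_percent / 100.0
    return 1.0 / fraction


def expected_activations(density_percent: float, num_tokens: int) -> float:
    """Expected count of activating tokens in a sample of num_tokens."""
    return (density_percent / 100.0) * num_tokens


if __name__ == "__main__":
    density = 0.011  # value reported on this page, in percent
    print(f"about 1 activation per {tokens_per_activation(density):.0f} tokens")
    print(f"about {expected_activations(density, 1_000_000):.0f} activations per million tokens")
```

Under that reading, a density of 0.011% marks a sparse feature, which fits an explanation as narrow as a single surname and its derivatives.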