INDEX
Explanations
proper nouns related to individuals
the repetition of the name "Henrik" in various contexts
Negative Logits
Token       Logit
terday      -0.76
EED         -0.73
conclud     -0.72
eele        -0.69
anwhile     -0.65
ENTS        -0.65
Sax         -0.64
ENCY        -0.62
Reference   -0.62
ateral      -0.61
Positive Logits
Token       Logit
rique       1.26
ning        1.03
lein        0.98
riks        0.95
sel         0.95
rik         0.93
agar        0.92
kel         0.90
ricks       0.90
emy         0.90
Activations Density: 0.017%
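
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming it means the fraction of token positions on which the feature fires at all. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature is active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 17 of 100,000 token positions.
acts = [0.0] * 100_000
for i in range(17):
    acts[i * 5_000] = 1.5  # arbitrary non-zero activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.017%
```

Under that reading, a density of 0.017% corresponds to roughly 17 active positions per 100,000 tokens, which is consistent with a sparse feature tied to a narrow set of name-like tokens.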