INDEX
Explanations
names of a specific person, "Henrik"
mentions of the name "Hen" in various contexts
Negative Logits
Token      Logit
EED        -0.79
terday     -0.72
align      -0.62
eele       -0.61
ATIVE      -0.61
henko      -0.60
conclud    -0.60
FACE       -0.60
SIGN       -0.58
ignment    -0.57
Positive Logits
Token      Logit
rique      1.30
riks       1.17
rik        1.12
ning       1.08
ricks      0.98
sel        0.95
riot       0.95
nery       0.93
etr        0.93
emy        0.93
Activations Density: 0.026%