Explanations
names containing "Hen" or variations like "Henrique" or "Henrik"
Negative Logits
EED         -0.79
terday      -0.73
henko       -0.68
ATIVE       -0.64
eele        -0.61
SIGN        -0.60
Reference   -0.59
ateral      -0.59
FACE        -0.59
align       -0.58
Positive Logits
rique       1.23
riks        1.12
rik         1.10
ning        1.05
nery        1.04
sel         0.99
ricks       0.95
riot        0.94
etr         0.94
ned         0.93
Activations Density: 0.025%