Explanations
names and titles of historical figures connected to nobility
Negative Logits
yro      -0.16
McKay    -0.16
poil     -0.15
ILON     -0.14
Mc       -0.14
_mc      -0.14
utsch    -0.14
urai     -0.14
Mickey   -0.14
Kaiser   -0.14
Positive Logits
Vere     0.23
Vy       0.21
Pag      0.20
Pon      0.20
Fans     0.20
Vill     0.20
Nug      0.19
Dorm     0.19
Mont     0.19
Hunger   0.19
Activations Density: 0.055%