INDEX
Explanations
proper names, particularly those of people
Negative Logits
Token        Logit
Bulgar       -0.68
士           -0.67
Archdemon    -0.66
imeters      -0.66
ashtra       -0.65
Bulgarian    -0.64
Vengeance    -0.63
Letter       -0.62
ç¥ŀ          -0.60
acea         -0.60
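The entry ç¥ŀ in the negative list may be a byte-level BPE token shown through a GPT-2-style byte-to-character table rather than a corrupted string. The page does not name the tokenizer, so this is an assumption. The sketch below rebuilds that table and maps such a token back to UTF-8 text; the function names `gpt2_bytes_to_unicode` and `decode_byte_level_token` are hypothetical helpers, not part of any dashboard API.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE tokenizers (assumed here; the source names no tokenizer)."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    # Every remaining byte is shifted to a code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


def decode_byte_level_token(token):
    """Map a byte-level token string back to the UTF-8 text it encodes."""
    char_to_byte = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # Tokens can end mid-character, so undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_byte_level_token("\u00e7\u00a5\u0140"))  # the "ç¥ŀ" entry
```

Under this assumption the three characters correspond to the bytes E7 A5 9E, which is the UTF-8 encoding of a single CJK character. The other tokens in the list are plain ASCII and decode to themselves.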
Positive Logits
Token        Logit
rentice      0.87
zik          0.81
enhagen      0.75
rees         0.70
rick         0.68
isch         0.68
isson        0.66
aylor        0.64
annon        0.64
elson        0.63
Activations Density 0.021%