Explanations
names of people
proper nouns, especially names
Negative Logits
Token        Logit
itaire       -0.89
Gibraltar    -0.79
akia         -0.79
Horror       -0.75
Trident      -0.73
Bethlehem    -0.71
ampton       -0.71
rapt         -0.70
abies        -0.70
role         -0.70
Positive Logits
Token        Logit
Nguyen       1.24
Huang        1.22
Choi         1.21
Chao         1.02
Ng           1.01
Fei          1.00
Chu          0.94
Xiang        0.92
Chung        0.91
guyen        0.90
Activations Density 0.011%