INDEX
Explanations
names of prominent individuals, particularly in political contexts
Head Attr Weights
Head  Weight
0     0.03
1     0.04
2     0.16
3     0.16
4     0.02
5     0.02
6     0.09
7     0.08
8     0.06
9     0.10
10    0.07
11    0.13
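
The head attribution weights above can be read as a rough ranking of which attention heads contribute most to this feature. The following is a minimal sketch of that reading, assuming the listed values are plain per-head weights. The names `head_attr_weights` and `rank_heads` are illustrative and not part of any dashboard API.

```python
# Per-head attribution weights copied from the listing above (head index -> weight).
head_attr_weights = {
    0: 0.03, 1: 0.04, 2: 0.16, 3: 0.16, 4: 0.02, 5: 0.02,
    6: 0.09, 7: 0.08, 8: 0.06, 9: 0.10, 10: 0.07, 11: 0.13,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted by weight, largest first.

    Python's sort is stable, so tied heads keep their original order.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_attr_weights)
total = sum(head_attr_weights.values())
top4 = ranked[:4]
top4_share = sum(w for _, w in top4) / total

print("top heads:", [h for h, _ in top4])
print(f"listed weights sum to {total:.2f}")
print(f"top-4 share of listed weight: {top4_share:.0%}")
```

Heads 2 and 3 lead at 0.16 each, followed by heads 11 and 9. The listed weights sum to 0.96 rather than 1.00, which is probably an effect of rounding to two decimals.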
Negative Logits
Token     Logit
nikov     -1.41
merga     -1.26
anwhile   -1.22
conserv   -1.20
orks      -1.15
avering   -1.12
TPS       -1.11
orah      -1.08
defense   -1.07
bara      -1.06
Positive Logits
Token     Logit
Ramos     1.16
Kemp      1.15
Became    1.14
Cummings  1.09
Rollins   1.09
Jonas     1.09
Hamm      1.06
Robinson  1.02
zos       1.02
iac       1.01
Activations Density 0.007%