INDEX
Explanations
terms related to winning or success in competitive contexts
Head Attr Weights
0:0.08
1:0.08
2:0.09
3:0.07
4:0.09
5:0.08
6:0.08
7:0.08
8:0.06
9:0.08
10:0.08
11:0.08
Negative Logits
��極:-2.35
amily:-2.22
女:-2.18
enthusi:-2.18
Karachi:-2.09
�:-2.07
umi:-2.05
sqor:-2.05
田:-2.00
Libyan:-1.97
Positive Logits
scissors:2.29
simul:2.25
pron:2.02
bows:2.02
inates:2.00
chalk:1.98
hops:1.95
foss:1.94
veto:1.93
Bake:1.92
Activations Density 0.000%