INDEX
Explanations
references to the name "Simons."
Head Attr Weights
0:0.01
1:0.04
2:0.10
3:0.25
4:0.02
5:0.02
6:0.10
7:0.13
8:0.03
9:0.07
10:0.07
11:0.11
Negative Logits
hydra        -1.10
sweat        -0.97
tackle       -0.92
PLA          -0.92
symb         -0.91
ankles       -0.91
podium       -0.90
footprints   -0.90
proxies      -0.90
poles        -0.88
Positive Logits
oute         1.27
atta         1.21
ull          1.19
olic         1.18
pei          1.17
oulos        1.16
reck         1.15
ourning      1.13
opol         1.13
bilt         1.13
Activations Density 0.008%