INDEX
Explanations
phrases related to processes or actions
Head Attr Weights
0:0.02
1:0.14
2:0.02
3:0.05
4:0.03
5:0.03
6:0.06
7:0.02
8:0.02
9:0.51
10:0.02
11:0.02
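
A minimal sketch for reading the head attribution weights listed above and finding the dominant head. It assumes each line has the form `head:weight`, as in the listing. The names `RAW` and `parse_head_weights` are hypothetical and not part of any dashboard API.

```python
# Head attribution weights copied verbatim from the listing above.
RAW = """0:0.02
1:0.14
2:0.02
3:0.05
4:0.03
5:0.03
6:0.06
7:0.02
8:0.02
9:0.51
10:0.02
11:0.02"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict."""
    weights = {}
    for line in text.splitlines():
        head, _, value = line.partition(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW)

# The head with the largest attribution weight.
top_head = max(weights, key=weights.get)
print(top_head, weights[top_head])
```

For these values, head 9 carries the largest weight (0.51). The twelve listed weights add up to about 0.94 rather than 1, which may simply reflect rounding in the displayed values.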
Negative Logits
Flores   -3.55
mole     -3.52
Kemp     -3.44
ayan     -3.28
aches    -3.28
aya      -3.19
anka     -3.17
oles     -2.99
aun      -2.97
hw       -2.96
Positive Logits
3        4.88
3        4.87
03       3.94
03       3.85
III      3.75
2003     3.74
303      3.71
cous     3.65
393      3.52
2003     3.52
Activations Density 0.165%