Explanations
verbs and phrases that suggest urgency or a rapid response
Head Attr Weights
0:0.02
1:0.01
2:0.05
3:0.06
4:0.09
5:0.02
6:0.08
7:0.39
8:0.03
9:0.03
10:0.10
11:0.06
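As a small illustration of how the head weights above might be read, the sketch below finds the head with the largest weight and its share of the listed total. It assumes the twelve values are per-head attribution scores indexed 0 to 11; the listing does not say how they are normalized, and the names `head_attr_weights` and `dominant_head` are hypothetical.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_attr_weights = {
    0: 0.02, 1: 0.01, 2: 0.05, 3: 0.06,
    4: 0.09, 5: 0.02, 6: 0.08, 7: 0.39,
    8: 0.03, 9: 0.03, 10: 0.10, 11: 0.06,
}


def dominant_head(weights):
    """Return (head_index, weight, share_of_listed_total) for the largest weight."""
    total = sum(weights.values())
    head = max(weights, key=weights.get)
    return head, weights[head], weights[head] / total


head, weight, share = dominant_head(head_attr_weights)
print(f"head {head}: weight {weight:.2f}, share of listed total {share:.1%}")
```

The share is computed against the sum of the listed values rather than against 1.0, since the rounded weights need not sum to exactly one.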
Negative Logits
alty        -1.69
Reserved    -1.54
watching    -1.45
paying      -1.43
listed      -1.42
onom        -1.41
gb          -1.39
angered     -1.35
partName    -1.34
terday      -1.31
Positive Logits
Horses      1.57
sprint      1.56
precip      1.56
derail      1.52
hills       1.52
horses      1.50
feats       1.50
horse       1.50
takedown    1.46
turnaround  1.45
Activations Density: 0.000%