INDEX
Explanations
phrases that indicate confrontation or conflict
Head Attr Weights
0:0.02
1:0.03
2:0.06
3:0.25
4:0.01
5:0.03
6:0.07
7:0.14
8:0.05
9:0.12
10:0.06
11:0.12
Negative Logits
Helpful       -1.29
stadt         -1.28
enthal        -1.13
nor           -1.09
ERSON         -1.07
hift          -1.06
Requires      -1.05
projecting    -1.05
donald        -1.05
enhagen       -1.03
Positive Logits
ensued        1.27
Achilles      1.16
Atlantis      1.14
emate         1.10
weed          1.09
foe           1.09
advers        1.08
Bros          1.08
Rabbit        1.07
raged         1.07
Activations Density 0.014%