INDEX
Explanations
phrases that refer to groups and their characteristics
Head Attr Weights
0:0.11
1:0.06
2:0.01
3:0.13
4:0.10
5:0.13
6:0.06
7:0.04
8:0.19
9:0.08
10:0.01
11:0.02
Negative Logits
tainted: -2.04
merged: -1.94
electronically: -1.87
digitally: -1.80
arsen: -1.79
TNT: -1.73
emitting: -1.72
tint: -1.70
VS: -1.69
tampering: -1.67
Positive Logits
rities: 2.77
arters: 2.35
abies: 2.18
ynes: 2.17
erning: 1.98
iences: 1.97
Case: 1.93
ptions: 1.93
forts: 1.90
ividual: 1.89
Activations Density 0.000%