Explanations
negative statements or denials
Head Attr Weights
0:0.07
1:0.09
2:0.07
3:0.07
4:0.08
5:0.07
6:0.07
7:0.15
8:0.07
9:0.07
10:0.08
11:0.07
Negative Logits
Mord         -1.62
Saur         -1.58
______       -1.58
McMahon      -1.55
Belarus      -1.52
Computing    -1.52
Yuri         -1.50
mater        -1.50
DOD          -1.50
Persia       -1.49
Positive Logits
erity        2.25
icion        2.01
apixel       1.89
itional      1.84
ullivan      1.84
nightly      1.79
letcher      1.77
escription   1.75
ancies       1.75
ailed        1.75
Activations Density 0.000%