INDEX
Explanations
contractions indicating negation or denial
negative language regarding existence or certainty
Negative Logits
PU            -0.66
Butt          -0.61
Relationship  -0.60
Mechdragon    -0.60
Species       -0.59
Draw          -0.57
face          -0.57
contractor    -0.57
maturity      -0.57
dress         -0.56
Positive Logits
't     1.30
iting  0.89
geon   0.88
etsk   0.88
tyard  0.86
ited   0.84
tesy   0.82
¹      0.82
ajor   0.81
nas    0.80
Activations Density 0.066%