INDEX
Explanations
contractions of "are not" in sentences
negative contractions and forms of denial
Negative Logits
ramid        -0.70
fox          -0.66
croft        -0.66
Mechdragon   -0.64
ochond       -0.63
PU           -0.62
Draw         -0.61
step         -0.60
duty         -0.60
owan         -0.59
Positive Logits
't           1.25
iting        0.84
ited         0.83
ajor         0.82
etsk         0.77
tyard        0.76
ates         0.74
tesy         0.72
geon         0.72
nel          0.72
Activations Density: 0.056%