INDEX
Explanations
italicized words
punctuation and structural elements within sentences
Negative Logits
enhagen             -0.73
INTON               -0.69
scrut               -0.69
quickShipAvailable  -0.67
AMP                 -0.64
emort               -0.63
Accountability      -0.62
unintention         -0.61
cryst               -0.61
TPS                 -0.61
Positive Logits
sic                 1.03
ensis               0.89
tnc                 0.77
dust                0.74
edu                 0.70
mun                 0.69
literally           0.68
oult                0.68
rosis               0.68
utils               0.67
Activations Density 0.871%