INDEX
Explanations
contractions of words, especially those expressing possession like "don't" and "´"
unusual or distinctive punctuation marks
Negative Logits
acknowled        -0.83
iceberg          -0.71
congr            -0.70
aristocracy      -0.69
congratulations  -0.67
ATURES           -0.67
acknow           -0.66
lehem            -0.66
ashes            -0.66
artifacts        -0.66
Positive Logits
Tex    0.78
-|     0.74
elle   0.72
_>     0.70
Da     0.70
olit   0.69
Daddy  0.69
SEC    0.68
Coach  0.67
ë      0.67
Activations Density 0.000%