INDEX
Explanations
phrases or sentences indicating opinions or declarations, often starting with "said it was."
repeated phrases indicating assertion or confirmation
Negative Logits
confines       -0.74
enment         -0.71
alus           -0.67
ossom          -0.67
aign           -0.65
acly           -0.65
stroll         -0.61
atives         -0.61
vertisement    -0.60
fts            -0.60
Positive Logits
owed           0.75
©¶æ            0.71
not            0.70
absolutely     0.70
uten           0.67
Ĥİ             0.67
"#             0.66
NOT            0.66
intentionally  0.65
not            0.65
Activations Density: 0.370%