INDEX
Explanations
phrases that emphasize contradiction or irony
Negative Logits
Token           Logit
ListItemIcon    -0.55
Ac              -0.51
fiés            -0.49
Santis          -0.47
اعم             -0.47
lis             -0.46
tủ              -0.45
المكان          -0.44
noons           -0.44
ISupport        -0.43
Positive Logits
Token           Logit
words           1.65
words           1.26
Words           1.23
Words           1.22
WORDS           1.20
słowa           1.12
word            1.10
woorden         1.08
palabras        1.06
WORDS           1.01
Activations Density: 0.193%