INDEX
Explanations
statements emphasizing the word "fact" and its variations
Negative Logits
Token           Logit
SiO             -0.54
aimez           -0.53
DIS             -0.47
erez            -0.45
SwitchCompat    -0.44
itolo           -0.44
Uni             -0.43
BAM             -0.43
करें            -0.43
masing          -0.43
Positive Logits
Token           Logit
indeed          1.30
indeed          1.29
fact            1.23
Indeed          1.20
Indeed          1.18
anzi            1.13
Bahkan          1.03
Bahkan          1.03
Infatti         1.03
事实上          1.01
Activations Density 0.258%