Explanations
phrases that express skepticism or challenge common beliefs
Negative Logits
however
-0.21
However
-0.17
却
-0.16
jedoch
-0.16
HOWEVER
-0.15
则
-0.15
اكن
-0.15
nevertheless
-0.15
smarty
-0.14
éal
-0.14
Positive Logits
?
0.16
importantly
0.16
进一步
0.15
nữa
0.15
,
0.15
equally
0.14
!
0.14
odash
0.14
-Za
0.14
otto
0.14
Activations Density 0.325%
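The density figure can be read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "fires" means an activation strictly above zero, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature is active on 13 of 4,000 token positions.
sample = [1.7] * 13 + [0.0] * 3987
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.325%"
```

Under this reading, a density of 0.325% means the feature is active on roughly 3 of every 1,000 tokens.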