INDEX
Explanations
phrases related to critical assessments or consequences
Negative Logits
inorder    -0.17
thereby    -0.16
เพ         -0.16
afin       -0.15
then       -0.15
nhằm       -0.15
using      -0.15
instead    -0.15
vice       -0.14
eca        -0.14
Positive Logits
especially       0.29
especially       0.24
particularly     0.22
indeed           0.21
especialmente    0.21
even             0.20
尤               0.20
Especially       0.20
pecially         0.20
особенно         0.20
Activations Density 0.328%