Explanations
phrases that refer to the financial implications or conclusions in a text
Negative Logits
erken     -0.18
Ùħا       -0.16
ever      -0.16
umber     -0.16
zin       -0.15
zn        -0.15
егоÑĢ     -0.15
onne      -0.15
vä        -0.14
IENT      -0.14
Positive Logits
most      0.24
bottom    0.22
Bottom    0.22
/top      0.21
bottom    0.20
-bottom   0.20
(bottom   0.20
.Bottom   0.19
-most     0.19
BOTTOM    0.18
Activations Density: 0.017%