INDEX
Explanations
phrases related to winning and losing outcomes
Negative Logits
160     -0.16
put     -0.15
uct     -0.15
oto     -0.15
157     -0.15
izza    -0.15
ott     -0.15
Orm     -0.14
PUT     -0.14
heed    -0.14
POSITIVE LOGITS
iteral    0.16
ForKey    0.16
amon      0.15
Kraj      0.14
agli      0.14
lassen    0.14
/effects  0.14
ave       0.14
ocker     0.13
ressing   0.13
Activations Density: 0.238%