Explanations
expressions indicating negation or warnings related to betting strategies
Negative Logits
ackbar    -0.16
sond      -0.15
raid      -0.15
N         -0.15
asser     -0.14
склад     -0.14
oen       -0.14
rof       -0.14
kün       -0.14
alcohol   -0.14
Positive Logits
uffs      0.17
cke       0.16
964       0.16
burger    0.15
atak      0.15
اهی       0.15
tab       0.14
όρ        0.14
ур        0.14
魔法      0.14
Activations Density 0.041%