INDEX
Explanations
mentions of making bets or expressing confidence in future outcomes
references to making predictions or placing bets
Negative Logits
ĸļ          -0.81
Flavoring   -0.78
nesota      -0.70
pmwiki      -0.69
ocumented   -0.68
VILLE       -0.67
ISTER       -0.66
Blaze       -0.65
umatic      -0.65
ovie        -0.64
Positive Logits
hesda       1.09
bets        1.04
terson      1.01
bet         0.93
ting        0.86
ters        0.85
roth        0.83
peg         0.83
ron         0.79
tery        0.78
Activations Density 0.011%