INDEX
Explanations
words related to prediction or warning
instances of the word "fore" in various contexts
Negative Logits
BuyableInstoreAndOnline    -0.84
REDACTED                   -0.79
RED                        -0.77
IRO                        -0.73
å°Ĩ                        -0.73
Stain                      -0.70
OPLE                       -0.67
Ou                         -0.66
Franks                     -0.65
Shed                       -0.63
Positive Logits
nsics      1.09
shadow     1.07
warn       1.02
told       1.01
father     1.01
warning    0.99
fore       0.97
runner     0.97
sight      0.96
shore      0.95
Activations Density 0.007%
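
To make the density figure concrete, here is a minimal sketch of how such a percentage could be computed. It assumes that density means the share of sampled tokens on which the feature's activation is above zero; the dashboard's exact definition is not stated here. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 7 of 100,000 tokens.
sample = [0.0] * 99_993 + [1.5] * 7
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under that assumption, a density of 0.007% corresponds to roughly 7 active tokens per 100,000 sampled.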