Explanations
words related to sports scoring
instances of the word "to" associated with actions or events
Negative Logits
censored      -0.80
postings      -0.76
pornographic  -0.76
minist        -0.76
censorship    -0.75
orno          -0.75
postage       -0.75
testimonies   -0.73
interf        -0.73
passwords     -0.72
Positive Logits
ggles         1.08
finish        1.01
scrimmage     0.99
earn          0.92
pload         0.92
congratulate  0.89
celebrate     0.86
propel        0.85
venge         0.84
improve       0.83
Activations Density: 0.190%