INDEX
Explanations
mentions of the word "for" occurring in various contexts
references to the word "For."
Negative Logits
inese      -0.65
orically   -0.60
upon       -0.59
Jr         -0.59
mare       -0.58
Burgess    -0.56
vom        -0.55
âĶ         -0.54
801        -0.53
icago      -0.53
Positive Logits
bidden     1.50
gotten     1.48
geries     1.19
gery       1.19
ked        1.11
example    1.08
gettable   1.06
instance   1.02
cers       1.02
cing       1.01
Activations Density: 0.136%