INDEX
Explanations
numerals indicating the frequency of an action or event
phrases that express repetition or frequency of events
Negative Logits
Token        Logit
heirs        -0.79
Marginal     -0.75
rights       -0.72
Reviewer     -0.70
XT           -0.69
Flavoring    -0.67
yright       -0.66
atorium      -0.66
Accessory    -0.66
uld          -0.65
Positive Logits
Token          Logit
cale           0.92
manship        0.80
entimes        0.80
Ago            0.77
terness        0.71
coded          0.71
interstitial   0.70
TER            0.69
points         0.68
theless        0.68
Activations Density 0.028%
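
A minimal sketch of how a density figure like the one above could be computed, assuming "Activations Density" means the fraction of token positions on which the feature's activation is nonzero. That definition, the function name `activation_density`, and the sample data are illustrative assumptions, not taken from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is defined as (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 28 of 100,000 token positions.
sample = [1.5] * 28 + [0.0] * (100_000 - 28)
density = activation_density(sample)
print(f"{density:.3%}")  # 0.028%
```

Under this reading, a density of 0.028% corresponds to roughly 28 active tokens per 100,000, which would make this a sparse feature.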