INDEX
Explanations
phrases and words indicating causation or influence
Negative Logits
pigeon        -0.78
Timber        -0.72
Methodist     -0.66
conservancy   -0.64
Alto          -0.63
staples       -0.62
quarters      -0.60
Armenian      -0.60
Dud           -0.60
croft         -0.59
Positive Logits
iveness       1.38
ual           1.24
uated         1.23
uating        1.16
uation        1.13
ually         1.11
uate          1.08
uates         1.07
ively         1.06
uality        1.00
Activations Density: 0.028%