Explanations
variations or differences in items or categories
Negative Logits
nothing     -0.70
ISTORY      -0.69
WARD        -0.68
Never       -0.67
OIL         -0.67
Behind      -0.66
Ĭ           -0.64
never       -0.64
emp         -0.64
Alert       -0.64
Positive Logits
iating      2.03
kinds       1.84
iates       1.72
types       1.65
iations     1.62
ials        1.58
versions    1.39
iator       1.36
iated       1.34
aspects     1.29
Activations Density 0.071%