INDEX
Explanations
phrases or words referring to variety or differences
Negative Logits
↑        -0.68
WI       -0.67
WARD     -0.64
HP       -0.62
OIL      -0.61
†        -0.60
Never    -0.60
AFTA     -0.58
Trivia   -0.57
Wilde    -0.56
Positive Logits
iating   1.83
iates    1.60
iator    1.37
iations  1.37
ials     1.36
iated    1.22
kinds    1.18
iable    1.11
iation   1.08
iate     1.07
Activations Density 0.436%