Explanations
modifiers for sorting or categorization of data
phrases indicating categorization and sorting by specific criteria
Negative Logits
hend      -0.78
iron      -0.77
ãĤ£       -0.71
rats      -0.70
DEF       -0.70
uala      -0.69
aphael    -0.69
kee       -0.69
lings     -0.68
hern      -0.68
Positive Logits
nationality     1.31
ethnicity       1.28
gender          1.25
geography       1.25
geographic      1.19
geographical    1.16
category        1.10
age             1.07
denomination    1.05
genders         1.04
Activations Density 0.161%