Explanations
terms related to categorization or classification processes
terms related to categorization and classification
Negative Logits
Token       Logit
vous        -0.76
perty       -0.72
elin        -0.72
orah        -0.68
homeland    -0.67
Tah         -0.67
ful         -0.67
rentice     -0.66
vig         -0.65
athi        -0.63
Positive Logits
Token       Logit
anguage     0.89
classify    0.86
REDACTED    0.83
ifications  0.83
ostic       0.82
ifier       0.82
categor     0.81
ically      0.81
ourgeois    0.80
rill        0.79
Activations Density: 0.038%