INDEX
Explanations
phrases related to division into categories or parts
phrases indicating division or categorization
Negative Logits
press   -0.77
iron    -0.75
aya     -0.74
dar     -0.72
eus     -0.68
gery    -0.67
nz      -0.64
inem    -0.63
WT      -0.63
efe     -0.63
Positive Logits
categories    1.27
manageable    1.23
subsections   1.18
separate      1.17
tiers         1.16
sections      1.15
phases        1.15
halves        1.13
thirds        1.12
smaller       1.11
Activations Density 0.096%