INDEX
Explanations
mentions of statistics or comparison of parts to the whole
phrases that refer to collective entities or groups
Negative Logits
andon       -0.73
orange      -0.66
bots        -0.62
iron        -0.62
antine      -0.61
Ru          -0.60
ATURES      -0.60
Arri        -0.59
Roy         -0.59
doms        -0.59
Positive Logits
whole        1.59
result       1.53
consequence  1.33
percentage   1.04
matter       0.95
cohesive     0.95
totality     0.94
standalone   0.90
Result       0.84
bloc         0.83
Activations Density: 0.068%