Explanations
mentions of the number of people in a group or setting
Negative Logits
Token      Logit
DERR       -0.73
harm       -0.65
ainer      -0.64
acts       -0.64
iph        -0.63
lik        -0.61
ç¥ŀ        -0.61
reads      -0.60
ocrates    -0.59
effects    -0.59
Positive Logits
Token        Logit
alliance     0.84
delegation   0.83
affair       0.83
ensemble     0.76
squad        0.76
lineup       0.75
backlog      0.75
orchestra    0.75
mble         0.73
army         0.73
Activations Density: 0.057%