Explanations
references to individuals or groups, particularly in the context of judgment or action
Negative Logits
Wilber      -0.85
ocarp       -0.81
écou        -0.78
laps        -0.76
Nag         -0.73
Mep         -0.72
MEG         -0.70
SNA         -0.69
Fascism     -0.69
Nag         -0.69
Positive Logits
those       1.60
Those       1.37
those       1.35
Those       1.32
THOSE       1.16
who         1.12
ceux        1.07
aquellos    1.03
那些        0.95
aqueles     0.95
Activations Density: 0.081%