INDEX
Explanations
phrases referring to a specific group of people or individuals
phrases indicating the existence or actions of specific groups of people
Negative Logits
Token       Logit
roundup     -0.66
Ted         -0.63
Courier     -0.63
Annie       -0.61
APD         -0.60
Dialog      -0.59
vag         -0.59
\\\\\\\\    -0.59
\/\/        -0.59
Lou         -0.58
Positive Logits
Token         Logit
iris          0.88
rir           0.76
mol           0.75
ãĥīãĥ©        0.75
contemplate   0.72
esta          0.70
oppose        0.68
idian         0.68
rame          0.67
perpetuate    0.67
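
The token ãĥīãĥ© in the positive-logits list looks like mojibake, but it is probably the raw byte-level BPE spelling of a non-ASCII token. The sketch below assumes the model uses a GPT-2-style byte-level tokenizer, which the dashboard itself does not state. Under that assumption it rebuilds the standard byte-to-character table and maps the token back to UTF-8 text. The function names are illustrative and are not part of any library.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2-style byte-level BPE.

    Assumption: the model behind this dashboard uses this tokenizer family.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token):
    """Map a byte-level token string back to readable UTF-8 text."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # Tokens can end mid-character, so replace invalid sequences
    # instead of raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥīãĥ©"))  # ドラ
```

If the assumption holds, the entry is the katakana pair ドラ rather than corrupted text. The same helper turns a leading Ġ into a space, which is how this tokenizer family marks word boundaries.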
Activations Density: 0.121%