INDEX
Explanations
phrases related to holding law enforcement accountable and discussing their actions beyond what is necessary
proper nouns, specifically names of individuals
Negative Logits
conver          -0.69
pport           -0.68
upside          -0.67
overwhelming    -0.66
antic           -0.65
planes          -0.65
stereotypes     -0.65
overhe          -0.64
taxing          -0.64
tro             -0.61
Positive Logits
Jr              1.53
Sr              1.25
PhD             1.08
MD              1.02
Jr              1.02
aka             0.93
Es              0.92
JR              0.86
MPH             0.84
JD              0.82
Activations Density 0.169%