INDEX
Explanations
references to the American Civil Liberties Union (ACLU)
mentions of the ACLU and related organizations
Negative Logits
rule       -0.80
lasses     -0.78
xit        -0.74
uled       -0.72
lass       -0.72
history    -0.69
hy         -0.67
gradient   -0.66
dating     -0.65
fecture    -0.64
Positive Logits
Fein       0.92
ACLU       0.92
Horowitz   0.87
ICE        0.83
DEF        0.81
OC         0.81
IG         0.80
ISON       0.79
ARI        0.79
UTH        0.76
Activations Density 0.013%