Explanations
references to prisons and incarceration
Negative Logits
Token        Logit
å²Ĺ          -0.16
inue         -0.16
ado          -0.14
Pros         -0.14
å´           -0.14
-priced      -0.14
iscrim       -0.13
.addField    -0.13
ataset       -0.13
ãĥ¼ãĥį       -0.13
Positive Logits
Token        Logit
.opensource  0.18
ocker        0.18
326          0.15
ateg         0.15
cci          0.15
eee          0.15
IRON         0.15
ÙĪØµ         0.15
orio         0.15
éĩ           0.14
Activations Density: 0.034%