INDEX
Explanations
terms related to prisons and incarceration
references to prisons and the prison system
Negative Logits
lass     -0.80
thora    -0.75
yip      -0.74
rians    -0.71
rian     -0.68
ï¸ı      -0.67
witz     -0.65
sound    -0.65
abeth    -0.64
£ı       -0.64
Positive Logits
inmates          1.04
inmate           0.96
sentences        0.88
prisons          0.87
prisoners        0.86
prison           0.84
house            0.81
prison           0.81
Guantanamo       0.81
rehabilitation   0.79
Activations Density: 0.043%