INDEX
Explanations
phrases related to isolation or confinement
references to solitary confinement and its implications
NEGATIVE LOGITS
oppers          -0.76
ichick          -0.71
exch            -0.70
raught          -0.68
enic            -0.67
soDeliveryDate  -0.67
orah            -0.67
ermott          -0.66
Clash           -0.66
enegger         -0.66
POSITIVE LOGITS
confinement     1.69
solitary        1.19
spection        0.73
inmate          0.72
dwellings       0.69
icol            0.68
prisoner        0.68
ism             0.67
minded          0.67
pen             0.66
Activations Density 0.013%
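
The density figure is not defined in this listing. A minimal sketch of one common reading follows, assuming it is the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the tool that produced these numbers.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all. The source does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 13 of 100,000 tokens.
toy_activations = [0.0] * 99_987 + [1.5] * 13
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.013%
```

Under that assumption, a density of 0.013% corresponds to roughly 13 active tokens per 100,000 sampled, which would fit a feature that fires only on a narrow topic.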