Explanations
instances related to isolation or confinement
references to solitary confinement and related themes of isolation
Negative Logits
Token             Logit
oppers            -0.83
ichick            -0.71
Mass              -0.69
amins             -0.69
efficients        -0.67
nostic            -0.67
soDeliveryDate    -0.67
ACTED             -0.67
inki              -0.66
rote              -0.66
Positive Logits
Token             Logit
confinement       1.52
solitary          1.21
uously            0.69
minded            0.69
spection          0.68
lone              0.67
theless           0.66
itud              0.66
plum              0.65
uous              0.65
Activations Density 0.006%
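
The page does not say how the density figure is computed. As a minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, a value like 0.006% could arise as follows. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / total


# Hypothetical sample: 100,000 token positions, feature active on 6 of them.
acts = [0.0] * 100_000
for position, value in [(10, 1.2), (500, 0.4), (2_000, 3.1),
                        (40_000, 0.9), (75_000, 2.2), (99_999, 0.7)]:
    acts[position] = value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.006%
```

Under this reading, a density of 0.006% means the feature is active on roughly 6 of every 100,000 tokens, which fits a narrow feature tied to a specific theme such as solitary confinement.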