INDEX
Explanations
references to interrogation and torture in a governmental context
Negative Logits
Token         Logit
aar           -0.14
andom         -0.14
icros         -0.13
ンプ          -0.13
oupon         -0.13
пат           -0.13
Consumers     -0.13
borough       -0.13
umer          -0.13
ommen         -0.13
Positive Logits
Token          Logit
interrogation  0.39
torture        0.39
interrog       0.37
water          0.31
Tort           0.30
TORT           0.30
tort           0.29
CIA            0.28
techniques     0.28
tortured       0.25
Activations Density: 0.017%