INDEX
Explanations
mentions of interrogation-related words and phrases
terms related to interrogation methods and practices
Negative Logits
Offline      -0.79
ensical      -0.73
jri          -0.71
minecraft    -0.70
yright       -0.69
buy          -0.68
ouf          -0.67
erry         -0.66
WE           -0.65
cakes        -0.65
Positive Logits
interrogation    1.01
interrog         0.99
Techniques       0.89
interrogated     0.85
techniques       0.84
questioning      0.68
probing          0.68
sessions         0.67
atories          0.67
coercive         0.67
Activations Density: 0.018%