Explanations
interviews with individuals
Negative Logits
math         -0.81
ciplinary    -0.77
mil          -0.74
blue         -0.72
Goods        -0.71
hover        -0.69
axy          -0.68
cil          -0.64
Haram        -0.64
sels         -0.64
Positive Logits
interview    1.01
ees          0.96
interviews   0.94
conducted    0.90
transcripts  0.88
aired        0.87
booth        0.85
Interview    0.84
Transcript   0.82
taped        0.81
Activations Density: 0.521%