INDEX
Explanations
mentions of recorded tapes or videos related to controversial subjects
Head Attr Weights
0:0.02
1:0.02
2:0.07
3:0.07
4:0.06
5:0.06
6:0.03
7:0.14
8:0.03
9:0.04
10:0.32
11:0.09
Negative Logits
ategor          -1.79
concess         -1.78
advant          -1.70
specialization  -1.69
denomin         -1.66
GCC             -1.62
vector          -1.62
grep            -1.57
Ability         -1.57
docker          -1.56
Positive Logits
tapes           2.93
recordings      2.63
videot          2.41
taped           2.41
tape            2.12
recording       2.02
interrogation   1.83
incrim          1.74
allegations     1.73
wiret           1.71
Activations Density: 0.004%