Explanations
references to vigilance or watching for something
Head Attr Weights
0:0.02
1:0.01
2:0.11
3:0.06
4:0.17
5:0.02
6:0.03
7:0.36
8:0.03
9:0.04
10:0.05
11:0.06
Negative Logits
YouTube        -1.47
Engineers      -1.47
sham           -1.46
CTV            -1.46
rane           -1.33
isec           -1.32
NBC            -1.31
SourceFile     -1.28
init           -1.27
wink           -1.26
Positive Logits
abilia         1.67
auder          1.67
efer           1.66
goodies        1.58
barg           1.58
elusive        1.53
furnish        1.53
aband          1.51
lett           1.45
affordability  1.45
Activations Density 0.001%