INDEX
Explanations
phrases expressing concern or fear about potential outcomes or situations
Head Attr Weights
0:0.03
1:0.01
2:0.07
3:0.12
4:0.10
5:0.03
6:0.05
7:0.26
8:0.04
9:0.03
10:0.09
11:0.12
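The `head:weight` pairs above can be read programmatically. The following is a minimal sketch, assuming each line holds one head index and one weight separated by a colon; the names `raw` and `parse_weights` are illustrative and not part of any tool that produced this listing.

```python
# Head attribution weights copied verbatim from the listing above.
raw = """0:0.03
1:0.01
2:0.07
3:0.12
4:0.10
5:0.03
6:0.05
7:0.26
8:0.04
9:0.03
10:0.09
11:0.12"""


def parse_weights(text):
    """Parse 'head:weight' lines into a dict of {head index: weight}."""
    weights = {}
    for line in text.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_weights(raw)

# Rank heads from largest to smallest weight; ties keep listing order.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]

# Rounded to two decimals to hide floating-point noise in the sum.
total = round(sum(weights.values()), 2)

print(f"top head: {top_head} (weight {top_weight})")
print(f"sum of listed weights: {total}")
```

On these values, head 7 carries the largest weight (0.26), and the listed weights sum to 0.95 rather than 1.00, so they may have been rounded for display or may not be normalized.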
Negative Logits
Reviewer        -1.84
olkien          -1.48
compute         -1.25
guiActiveUn     -1.21
pageant         -1.21
DragonMagazine  -1.20
annotations     -1.20
Completed       -1.20
obligatory      -1.19
Whedon          -1.16
Positive Logits
relapse         1.20
signs           1.18
egu             1.17
romising        1.16
uces            1.15
Signs           1.14
fect            1.13
compromising    1.11
ecause          1.10
loo             1.08
Activations Density 0.011%