Explanations
expressions of doubt or disbelief
Head Attr Weights
0:0.07
1:0.03
2:0.08
3:0.10
4:0.02
5:0.04
6:0.12
7:0.14
8:0.10
9:0.07
10:0.09
11:0.11
Negative Logits
{:              -1.21
Painting        -1.19
CLR             -1.19
ラン            -1.03
DOC             -1.02
RAD             -1.00
ガ              -0.96
Engineers       -0.94
..............  -0.94
Pear            -0.94
Positive Logits
myself          1.52
yet             1.31
tarian          1.13
bol             1.12
anymore         1.08
qus             1.04
too             1.01
inge            0.98
Nor             0.98
iden            0.98
Activations Density 0.030%