INDEX
Explanations
phrases indicating uncertainty or lack of clarity
Head Attr Weights
0:0.02
1:0.02
2:0.10
3:0.13
4:0.08
5:0.02
6:0.28
7:0.09
8:0.04
9:0.03
10:0.06
11:0.06
Negative Logits
subsequent    -1.30
cellaneous    -1.22
Skydragon     -1.19
margins       -1.17
accompany     -1.14
endeavour     -1.14
concludes     -1.14
版            -1.13
izons         -1.10
ranging       -1.10
Positive Logits
anymore       2.10
yet           1.93
necess        1.70
nor           1.59
actly         1.52
entirely      1.49
isible        1.49
eem           1.44
terribly      1.40
bothered      1.35
Activations Density 0.130%