INDEX
Explanations
phrases indicating certainty or inevitability
Head Attr Weights
0     0.02
1     0.02
2     0.17
3     0.08
4     0.03
5     0.05
6     0.11
7     0.03
8     0.05
9     0.05
10    0.05
11    0.29
Negative Logits
FY            -1.86
Darling       -1.74
rehearsal     -1.71
schedules     -1.70
Studios       -1.69
pregnancies   -1.68
resolutions   -1.65
airs          -1.62
Abbott        -1.61
Cran          -1.60
Positive Logits
ibaba         2.08
gdala         2.02
alions        1.77
andise        1.75
atility       1.75
orsi          1.74
MpServer      1.68
folios        1.66
hemat         1.65
cipl          1.61
Activations Density 0.006%