INDEX
Explanations
introductory phrases or transitions indicating a statement or explanation
phrases that indicate certainty or express opinions
Negative Logits
obal      -0.88
Cheong    -0.71
ritic     -0.69
ourses    -0.68
hammad    -0.67
oided     -0.66
alg       -0.65
uga       -0.65
Lumpur    -0.64
emouth    -0.63
Positive Logits
ties              0.65
erous             0.64
caveats           0.63
WHERE             0.60
!--               0.60
,.                0.59
understatement    0.59
-----------       0.58
bear              0.58
advertising       0.57
Activations Density: 0.065%