INDEX
Explanations
phrases indicating a detailed explanation or discussion
phrases indicating explanation or elaboration
Negative Logits
nor          -0.65
mouth        -0.63
oi           -0.59
Canaveral    -0.58
opped        -0.57
canceled     -0.55
arov         -0.54
mson         -0.54
Atl          -0.54
Releases     -0.54
Positive Logits
below        1.53
here         1.25
briefly      1.17
HERE         1.15
below        1.12
herein       1.11
extensively  1.10
hereafter    1.04
later        1.03
momentarily  1.01
Activations Density: 0.306%