INDEX
Explanations
phrases indicating a cause-and-effect relationship
phrases indicating causes and effects or consequences
Negative Logits
utenberg     -0.67
ategories    -0.63
anchester    -0.60
tty          -0.57
ounding      -0.56
ufact        -0.56
kindred      -0.55
essen        -0.55
Week         -0.55
jab          -0.54
Positive Logits
thereafter   0.87
onwards      0.80
they         0.79
there        0.79
nobody       0.74
THERE        0.72
forth        0.71
thereof      0.71
we           0.69
it           0.68
Activations Density: 0.259%