INDEX
Explanations
descriptions of comparisons or contrasts between different elements
complex relationships or comparisons between quantitative variables or concepts
Negative Logits
hesitated     -0.63
resembled     -0.62
reminded      -0.61
tripled       -0.60
dodged        -0.60
apologized    -0.60
echoed        -0.60
thanked       -0.59
lied          -0.58
glanced       -0.58
Positive Logits
counterparts  0.80
ones          0.72
realities     0.72
counterpart   0.68
Telesc        0.65
existing      0.64
incumbent     0.63
extremes      0.63
illon         0.62
orable        0.62
Activations Density 0.652%