Explanations
phrases involving contrast or balance in actions or situations
phrases emphasizing continuity or persistence in actions or states
Negative Logits
uala            -0.68
eah             -0.63
Bre             -0.63
uca             -0.62
hack            -0.62
Roh             -0.62
resa            -0.61
onica           -0.61
éĹĺ             -0.60
rament          -0.59

Positive Logits
otherwise       0.84
contrasted      0.82
simultaneously  0.79
ignoring        0.75
nonetheless     0.75
thereby         0.73
forgetting      0.72
nevertheless    0.72
whereas         0.72
unlike          0.70
Activations Density: 0.540%