INDEX
Explanations
phrases related to agreement or disagreement
phrases indicating disagreement or criticism
Negative Logits
ario             -0.68
atto             -0.65
iture            -0.65
notwithstanding  -0.62
Donation         -0.61
etermination     -0.60
solution         -0.60
leep             -0.60
disposed         -0.59
urated           -0.59
Positive Logits
those            0.72
those            0.69
these            0.68
our              0.68
us               0.64
them             0.64
their            0.64
Adams            0.61
their            0.60
his              0.60
Activations Density 0.150%