INDEX
Explanations
phrases related to comparisons or evaluations, particularly emphasizing a contrast between two elements
phrases that include the word "considering."
Negative Logits
Token     Logit
inis      -0.79
ernal     -0.78
uala      -0.77
scribe    -0.77
jer       -0.76
orem      -0.75
arez      -0.75
rouse     -0.73
vous      -0.73
inals     -0.71
Positive Logits
Token         Logit
how           0.93
why           0.72
hindsight     0.70
hordes        0.70
recent        0.70
everything    0.68
what          0.66
everyone      0.66
that          0.65
considering   0.65
Activations Density: 0.082%