Explanations
phrases related to comparison or opposition (e.g., "good vs evil", "high vs low sensitivity")
conjunctions and prepositions indicating relationships or comparisons
Negative Logits
wagen        -0.69
nen          -0.69
Players      -0.68
illac        -0.64
Cosponsors   -0.63
sugg         -0.62
condem       -0.62
misunder     -0.62
elle         -0.61
Uncommon     -0.61
Positive Logits
infinity     0.73
lowly        0.72
emptiness    0.64
messenger    0.63
minus        0.63
voic         0.63
decay        0.63
suffix       0.61
math         0.60
multiply     0.60
Activations Density: 0.333%