INDEX
Explanations
the word "Either" indicating a choice between two options
the word "either," indicating alternative choices or options
Negative Logits
riad          -0.75
roxy          -0.73
lights        -0.72
acter         -0.72
appings       -0.71
achus         -0.69
ricting       -0.68
ussions       -0.67
inav          -0.66
appers        -0.65
Positive Logits
lift          0.73
either        0.72
individually  0.66
ante          0.64
overtly       0.64
Either        0.63
Rivals        0.63
Option        0.63
implicitly    0.61
side          0.60
Activations Density: 0.016%