INDEX
Explanations
phrases related to resistance or opposition
Negative Logits
nces         -0.86
FORE         -0.74
]}           -0.68
terday       -0.66
shire        -0.65
inction      -0.65
SEA          -0.64
ances        -0.64
foundland    -0.64
Stab         -0.64
Positive Logits
envelope     0.87
boundaries   0.86
buttons      0.86
agenda       0.78
wedge        0.78
agendas      0.75
brakes       0.75
harder       0.74
rollers      0.74
farther      0.73
Activations Density: 0.149%