Explanations
phrases related to 'instead of'
Negative Logits
Token        Logit
iky          -0.76
andestine    -0.68
Shake        -0.62
ORN          -0.61
andal        -0.60
Dian         -0.59
Smash        -0.59
anon         -0.58
esses        -0.58
ankind       -0.58
Positive Logits
Token        Logit
opting       0.89
preferring   0.87
than         0.78
relying      0.76
lect         0.71
than         0.70
choosing     0.69
chose        0.67
opt          0.67
Instead      0.66
Activations Density: 0.908%