Explanations
references to actions involving flipping or turning something over
Negative Logits
lain            -0.84
Interstitial    -0.78
hart            -0.76
aph             -0.72
FORE            -0.72
Requires        -0.70
needs           -0.68
chell           -0.68
Annotations     -0.67
hips            -0.67

Positive Logits
olitics         0.90
switch          0.87
burgers         0.84
flipped         0.83
flo             0.82
tera            0.80
osition         0.78
flip            0.78
bill            0.78
pend            0.77
Activations Density 0.022%