INDEX
Explanations
references to 'drop' features or actions
references to "drop" in various contexts
Negative Logits
paio         -0.89
gregation    -0.81
apo          -0.73
ILLE         -0.72
aldo         -0.71
urally       -0.70
Gutenberg    -0.70
ancial       -0.69
Seg          -0.65
eur          -0.65
Positive Logits
down         1.06
kick         1.05
downs        1.03
phrine       0.91
lights       0.89
bows         0.85
outs         0.85
drop         0.83
backs        0.82
downed       0.82
Activations Density 0.019%