INDEX
Explanations
words related to actions of obstructing or preventing something
instances of the words "blocking" and "blocked"
Negative Logits
llers     -0.80
gow       -0.80
ller      -0.74
lli       -0.73
brates    -0.70
BUS       -0.70
prise     -0.68
mble      -0.68
brate     -0.68
EMBER     -0.65
Positive Logits
busters   0.97
quote     0.93
buster    0.88
aded      0.84
blocking  0.83
ades      0.83
picking   0.76
ading     0.76
chains    0.73
osuke     0.72
Activations Density 0.026%