Explanations
mentions of actions of blocking or being blocked (e.g., "Blocked", "Unblock", "Blocking")
occurrences of the word "blocked."
Negative Logits
Sov        -0.72
shown      -0.67
gain       -0.67
abund      -0.67
repre      -0.65
present    -0.64
rift       -0.64
nucleus    -0.64
eah        -0.63
bons       -0.63
Positive Logits
ãĤ´ãĥ³     0.82
ĵĺ         0.82
utsche     0.72
ogging     0.71
icer       0.68
wana       0.68
ESCO       0.67
iard       0.66
Lists      0.65
jamin      0.64
Activations Density 0.022%