INDEX
Explanations
The neuron activates on subword fragments of “zombie” and on closely related terms such as “apocalypse”.
Negative Logits
Feed            -0.07
Unblock         -0.07
ADM             -0.06
PLAN            -0.06
Showing         -0.06
<Application    -0.06
งหมด            -0.06
658             -0.06
üst             -0.06
在线视频        -0.06
Positive Logits
章              0.08
prev            0.07
hoot            0.06
anitize         0.06
\"",↵           0.06
parsley         0.06
тон             0.06
伍              0.06
_finalize       0.06
кишеч           0.06
Activations Density 0.013%