Explanations
This neuron detects mentions of appellate review outcomes, especially forms of “affirm” (e.g., “affirm,” “affirming,” “affirmed”).
Negative Logits
ToStr    -0.07
take     -0.07
slept    -0.07
Buk      -0.07
Range    -0.07
got      -0.07
.sys     -0.06
let      -0.06
-move    -0.06
Gos      -0.06
Positive Logits
wise            0.07
пищ             0.07
_AS             0.06
수행            0.06
accountability  0.06
(tx             0.06
FactoryGirl     0.06
pivot           0.06
VStack          0.06
wav             0.06
Activations Density: 0.004%