INDEX
Explanations
This neuron detects hedging phrases built around “sort of” (i.e. the pattern “sort of […]”).
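As a rough illustration of the textual pattern named in the explanation, the sketch below finds literal occurrences of "sort of" in a string. It is only a surface-level stand-in: the neuron is not a regular expression, and the names `SORT_OF` and `find_sort_of` are hypothetical helpers introduced here, not part of any dashboard or library API.

```python
import re

# Assumption: a case-insensitive, whole-word match on "sort of" is a fair
# proxy for the text the explanation describes. The neuron itself may fire
# on a narrower or wider set of contexts.
SORT_OF = re.compile(r"\bsort of\b", re.IGNORECASE)


def find_sort_of(text: str) -> list[tuple[int, int]]:
    """Return (start, end) character spans of each "sort of" occurrence."""
    return [m.span() for m in SORT_OF.finditer(text)]


if __name__ == "__main__":
    sample = "It was sort of a surprise, and Sort of odd."
    for start, end in find_sort_of(sample):
        print(start, end, repr(sample[start:end]))
```

A matcher like this could be used to pick candidate sentences for checking the explanation against the neuron's recorded activations.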
Negative Logits
lemma         -0.07
ecological    -0.06
AppBundle     -0.06
(Web          -0.06
.runtime      -0.06
územ          -0.06
sub           -0.06
.inputs       -0.06
그러          -0.06
.forChild     -0.06
Positive Logits
ropped         0.07
gal            0.07
olo            0.06
.roll          0.06
()↵            0.06
Populate       0.06
Phil           0.06
(GameObject    0.06
640            0.06
Te             0.06
Activations Density 0.012%