Explanations
The neuron detects occurrences of the word “prompt” (and its close variants) in the text.
Negative Logits
τογραφ        -0.07
irectory      -0.07
Publishing    -0.07
.cursor       -0.06
ิเวณ          -0.06
Ross          -0.06
Ant           -0.06
막            -0.06
진행          -0.06
slur          -0.06
Positive Logits
>'.           0.06
_IC           0.06
toa           0.06
tog           0.06
styled        0.06
gio           0.06
tad           0.06
Norfolk       0.06
.removeClass  0.06
@FindBy       0.06
Activations Density: 0.009%