Explanations
The neuron primarily activates on the word “implementation.”
Negative Logits
Token       Logit
grey        -0.06
/epl        -0.06
_no         -0.06
ald         -0.06
cer         -0.06
elps        -0.06
IN          -0.06
ึ้          -0.06
"</         -0.06
.segment    -0.06
Positive Logits
Token       Logit
로그        0.06
begun       0.06
Feeling     0.06
�           0.06
resentment  0.06
Illegal     0.06
ordial      0.06
kish        0.05
canceled    0.05
???         0.05
Activations Density: 0.019%
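
For readers unfamiliar with the density figure, the sketch below shows one plausible way such a number could be computed. It assumes that "density" means the percentage of sampled tokens on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density = 100 * (tokens that fire) / (tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical sample (not real data): 19 firing tokens out of 100,000.
sample = [1.0] * 19 + [0.0] * 99_981
print(f"{activation_density(sample):.3f}%")
```

Under this assumed definition, a density of 0.019% would correspond to roughly 19 firing tokens per 100,000 sampled, which fits a neuron that responds mainly to a single word.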