Explanations
Common English words
The neuron responds to words expressing recommendations or instructions (e.g. “should,” “need,” “making,” “please”).
Negative Logits
Token            Logit
|↵↵              -0.07
!*               -0.07
/******/↵        -0.06
>())↵            -0.06
-----------↵↵    -0.06
?>'              -0.06
HOME             -0.06
::↵↵             -0.06
疾               -0.06
>()↵↵            -0.06
Positive Logits
Token            Logit
이동             0.06
ife              0.06
ويك              0.06
yüzde            0.06
prostřed         0.06
downtown         0.06
ampilkan         0.06
venient          0.06
UCCESS           0.06
Responsible      0.06
Activations Density 0.210%
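
The density figure can be read as the share of sampled token positions on which the neuron fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 21 of which are active.
sample = [0.0] * 9979 + [1.5] * 21
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.210%
```

Under this reading, a density of 0.210% means the neuron is active on roughly 21 of every 10,000 tokens, which is sparse.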