INDEX
Explanations
conversational exchanges
The neuron fires on tokens at the start of user questions, i.e. it appears to detect when the user is asking a question.
Negative Logits
方法      -0.07
CE        -0.07
airport   -0.07
注意      -0.06
YYYY      -0.06
heiß      -0.06
_cert     -0.06
Abdul     -0.06
험        -0.06
_MULTI    -0.06
Positive Logits
latitude  0.06
Suit      0.06
.emplace  0.06
.banner   0.06
Justice   0.06
ба        0.06
*/ ↵ ↵    0.06
нов       0.06
заяви     0.06
_false    0.06
Activations Density 0.068%