Explanations
reviews and articles
This neuron detects mentions of the organization that developed a language model, or of the model's system identifier (e.g., “Large Model Systems Organization (LMSYS)”).
Negative Logits
Token      Logit
-simple    -0.06
cellar     -0.06
_ob        -0.06
-python    -0.06
Suc        -0.06
파         -0.06
막         -0.06
""),       -0.06
ーブル     -0.06
elő        -0.06
Positive Logits
Token      Logit
interface  0.07
–↵↵        0.06
(logging   0.06
Tele       0.06
(guild     0.06
' ↵        0.06
↵ ↵        0.06
součas     0.06
ROME       0.06
dete       0.06
Activations Density 0.004%
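
The density figure above can be read as the share of sampled tokens on which the neuron fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption: the function name, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density_percent(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens with a
    nonzero (above-threshold) activation. The source page does not
    confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 4 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_996 + [1.3, 0.2, 2.1, 0.7]
print(f"{activation_density_percent(sample):.3f}%")
```

Under this reading, a density of 0.004% corresponds to roughly 4 active tokens per 100,000 sampled.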