Explanations
questions
This neuron never activates—it doesn’t detect or respond to any token patterns.
The neuron detects interrogative cues: tokens that begin or appear in questions, such as question words and the auxiliaries used to form questions.
Negative Logits
overtime      -0.07
.setPassword  -0.07
Var           -0.07
Tel           -0.06
ilities       -0.06
_RED          -0.06
bán           -0.06
waktu         -0.06
еч            -0.06
alara         -0.06
Positive Logits
Shall         0.07
theorem       0.07
Mourinho      0.06
agog          0.06
andscape      0.06
GOOD          0.06
Mechan        0.06
erdem         0.06
olumsuz       0.06
Sexy          0.06
Activations Density 0.118%