Explanations
question
The neuron activates on occurrences of the word “question” (and its surrounding punctuation) in the prompt text.
Negative Logits
版    -0.07
mirror    -0.07
arnation    -0.07
با    -0.07
apping    -0.07
viewPager    -0.07
inauguration    -0.07
/>,↵    -0.07
BR    -0.07
xCA    -0.07
Positive Logits
minimized    0.06
undert    0.06
conqu    0.06
월까지    0.06
χεί    0.06
discrepancies    0.06
QIcon    0.06
狀    0.05
://{    0.05
-dd    0.05
Activations Density 0.011%