INDEX
Explanations
introductions
The neuron activates on words in outline or heading phrases, such as “Brief,” “Explanation,” and “Overview,” effectively detecting section-heading or bullet-point text in an outline.
Negative Logits
Token                               -0.07
//--------------------------------  -0.07
documents                           -0.06
parents                             -0.06
Kelvin                              -0.06
.scope                              -0.06
Hizmet                              -0.06
образ                               -0.06
.question                           -0.06
Count                               -0.06
Positive Logits
CURRENT  0.06
lucky    0.06
nat      0.06
ワイト   0.06
dv       0.06
ां       0.06
ocom     0.06
трон     0.06
òa       0.06
SOC      0.06
Activations Density 0.041%