Explanations
forum posts
This neuron responds to small floating-point confidence scores and metadata values (numeric tokens of the form “0.xxx”) embedded in log text.
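As a rough illustration of the kind of token the explanation refers to, the sketch below pulls "0.xxx"-style values out of a log line. The regular expression, the function name `find_score_tokens`, and the sample log line are all hypothetical and chosen only for illustration; they are not taken from the neuron's actual activation data, and the neuron may well fire on a broader or narrower set of tokens.

```python
import re

# Hypothetical pattern: a leading zero, a dot, then one or more digits,
# not embedded in a longer dotted number such as a version string.
SCORE_PATTERN = re.compile(r"(?<![\d.])0\.\d+(?![\d.])")


def find_score_tokens(line: str) -> list[str]:
    """Return every standalone 0.xxx-style value found in a log line."""
    return SCORE_PATTERN.findall(line)


# Made-up log line, for illustration only.
sample = 'detect id=17 label="cat" conf=0.873 iou=0.412 version=1.0.2'
print(find_score_tokens(sample))  # ['0.873', '0.412']
```

The lookbehind and lookahead keep fragments of version strings such as `1.0.2` from being counted as scores.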
Negative Logits
Token          Logit
文             -0.07
cable          -0.07
ynes           -0.07
Stunden        -0.07
steam          -0.06
Readers        -0.06
mask           -0.06
engineering    -0.06
ance           -0.06
handler        -0.06
Positive Logits
Token          Logit
تشکیل          0.06
'][$           0.06
adet           0.06
Taipei         0.05
تمامی          0.05
_digest        0.05
Before         0.05
ποτε           0.05
้อม            0.05
한국           0.05
Activations Density 0.034%