    Explanations

    forum posts

    This neuron responds to small floating-point confidence scores and metadata values (numeric “0.xxx” tokens) embedded in the log.

    Negative Logits
    -0.07
     cable  -0.07
    ynes  -0.07
     Stunden  -0.07
     steam  -0.06
     Readers  -0.06
    mask  -0.06
     engineering  -0.06
    ance  -0.06
    handler  -0.06
    Positive Logits
     تشکیل  0.06
    '][$  0.06
     adet  0.06
     Taipei  0.05
     تمامی  0.05
    _digest  0.05
    Before  0.05
    ποτε  0.05
    ้อม  0.05
     한국  0.05
    Act Density 0.034%

    No Known Activations