INDEX
    Explanations

    multiple languages

    The neuron activates strongly on single-word affirmative replies, i.e. short tokens meaning “yes” (e.g. “Да,” “Sim”).

    Negative Logits
    _HOOK           -0.06
    Leo             -0.06
    workers         -0.06
     nestled        -0.06
    Book            -0.06
     před           -0.06
     Herbert        -0.06
    497             -0.06
    iset            -0.05
     UAE            -0.05
    Positive Logits
    _FACTOR         0.07
    _PID            0.07
     trộn           0.07
     toItem         0.06
                    0.06
    ;.              0.06
    ormal           0.06
                    0.06
    ------------↵   0.06
                    0.06
    Act Density 0.018%

    No Known Activations