    Explanations

    appropriateness

    This neuron remains inactive on all input tokens and does not respond to any particular pattern.

    Negative Logits
    Token               Logit
    "Simple"            -0.07
    " Sheep"            -0.07
    " CELL"             -0.07
    (no visible token)  -0.06
    " particle"         -0.06
    " Rotate"           -0.06
    "Hard"              -0.06
    " rebels"           -0.06
    " maintenance"      -0.06
    " presentation"     -0.06
    Positive Logits
    Token               Logit
    "λεκ"               0.06
    "ा.↵"               0.06
    " наш"              0.06
    " จะ"               0.06
    (no visible token)  0.06
    " тран"             0.06
    " :)↵"              0.06
    " แก"               0.06
    " 순간"             0.06
    " ож"               0.06
    Act Density 0.027%

    No Known Activations