INDEX
    Explanations

    equals sign

    This neuron fires on almost no tokens (activation density 0.011%), so it does not appear to detect any consistent pattern.

    Negative Logits
    Token             Logit
    "sters"           -0.07
    "tablet"          -0.06
    " arrangement"    -0.06
    "procedure"       -0.06
    "Keywords"        -0.06
    " obligated"      -0.06
    " watts"          -0.06
    "ύτε"             -0.06
    " Monday"         -0.06
    " Panc"           -0.06
    Positive Logits
    Token             Logit
    " fillColor"      0.06
    "ENTIC"           0.06
    " december"       0.06
    "ennifer"         0.06
    " {↵"             0.06
    " republican"     0.06
    "_CUDA"           0.06
    "↵    ↵↵"         0.06
    "ProgressBar"     0.06
    "EGA"             0.06
    Activation Density: 0.011%

    No Known Activations