INDEX
    Explanations

    This neuron detects the substring “Inf” at the start of tokens (i.e., words beginning with “inf-”).
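As a rough illustration of the pattern described above, the sketch below applies a case-insensitive "starts with inf" check to the ten positive-logit tokens listed on this page. The helper `starts_with_inf` and its rule of ignoring a leading space or underscore are assumptions made for illustration. They are one reading of the explanation, not the method used to generate it.

```python
# Top positive-logit tokens as listed on this page (leading spaces preserved).
positive_tokens = [" inf", " Inf", "Inf", "inf", "ाध", "FI", " INF", "_INF", " isn", "inks"]


def starts_with_inf(token: str) -> bool:
    """Hypothetical check: True if the token begins with 'inf',
    ignoring case and any leading spaces or underscores."""
    return token.lstrip(" _").lower().startswith("inf")


# Split the listed tokens into those that fit the described pattern and those that do not.
matches = [t for t in positive_tokens if starts_with_inf(t)]
others = [t for t in positive_tokens if not starts_with_inf(t)]

print("match:", matches)
print("other:", others)
```

Under this reading, six of the ten listed tokens fit the pattern, while "ाध", "FI", " isn", and "inks" do not.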

    Negative Logits
    Token                                       Logit
    " Mage"                                     -0.08
    " Zack"                                     -0.08
    " buckle"                                   -0.07
    " Gong"                                     -0.07
    " Jame"                                     -0.07
    " Game"                                     -0.07
    "Game"                                      -0.07
    "****************************************"  -0.07
    "Kate"                                      -0.07
    " wakeup"                                   -0.07
    Positive Logits
    Token       Logit
    " inf"      0.14
    " Inf"      0.13
    "Inf"       0.11
    "inf"       0.11
    "ाध"        0.08
    "FI"        0.08
    " INF"      0.08
    "_INF"      0.08
    " isn"      0.07
    "inks"      0.07
    Act Density 0.012%

    No Known Activations