INDEX
    Explanations

    This neuron appears to activate on scattered words and phrases, possibly short function words or verb phrases, with no coherent shared meaning.

    NEGATIVE LOGITS
     mó
    -0.07
    moid
    -0.06
     slightly
    -0.06
    ろう
    -0.06
    免
    -0.06
    invalid
    -0.06
     вполне
    -0.06
    irim
    -0.06
     Invalid
    -0.06
    有人
    -0.06
    POSITIVE LOGITS
     limited
    0.17
     lack
    0.17
     absence
    0.15
     lacking
    0.15
    limited
    0.14
     lacks
    0.14
    lack
    0.13
     minimal
    0.13
     lacked
    0.13
     Lack
    0.12
    Activation Density 0.054%

    No Known Activations