INDEX
    Explanations

    This neuron activates whenever the token “Air” (or “air”) appears, effectively detecting mentions of “air.”

    Negative Logits
    Token                     Logit
    " ['#"                    -0.07
    " структу"                -0.07
    " novice"                 -0.07
    " Duc"                    -0.07
    " Prosecutor"             -0.07
    " NSStringFromClass"      -0.06
    " renovation"             -0.06
    "ugu"                     -0.06
    " Rud"                    -0.06
    "_pickle"                 -0.06

    Positive Logits
    Token                     Logit
    " air"                     0.20
    " Air"                     0.19
    "Air"                      0.16
    "AIR"                      0.14
    " AIR"                     0.14
    "air"                      0.13
    "-air"                     0.12
    "_air"                     0.11
    " "                        0.10
    " airflow"                 0.10

    Activation Density: 0.025%
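
For a sense of scale, the density figure above can be read as the share of token positions on which the neuron fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and do not come from this page, which does not state how its density value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    non-zero (above-threshold) activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1 active position out of 4,000 corresponds to 0.025%.
example = [0.0] * 3999 + [1.7]
density = activation_density(example)
print(f"{density:.3%}")  # prints 0.025%
```

Under this reading, a density of 0.025% means the neuron is active on roughly one token in every 4,000.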

    No Known Activations