INDEX
    Explanations

    The neuron activates on text describing magnetism: terms about magnets, magnetic forces, and related materials.

    Negative Logits
    "348"              -0.07
    "ールド"            -0.06
    " assertEquals"    -0.06
    " hero"            -0.06
    "\tassertEquals"   -0.06
    "arto"             -0.06
    "ostel"            -0.06
    "attributes"       -0.06
    " raids"           -0.06
    " otev"            -0.06
    Positive Logits
    " resin"       0.08
    " Alic"        0.07
    "-project"     0.06
    " inspir"      0.06
                   0.06
    " Diet"        0.06
    " fotoğraf"    0.06
    " breast"      0.06
    "(rest"        0.06
    " chiff"       0.06
    Act Density 0.034%

    No Known Activations