INDEX
    Explanations

    This neuron activates on occurrences of the adjective “important.”

    Negative Logits
    Token           Logit
    "32"            -0.07
    "266"           -0.07
    " scan"         -0.07
    " wake"         -0.07
    " Ethernet"     -0.07
    "34"            -0.07
    "agation"       -0.07
    "392"           -0.07
    " sea"          -0.07
    "340"           -0.07
    Positive Logits
    Token           Logit
    " important"    0.10
    " Important"    0.09
    " importance"   0.09
    " IMPORTANT"    0.08
    "Important"     0.07
    " Importance"   0.07
    " vriend"       0.07
    "_REQUIRE"      0.07
    " gehört"       0.07
    " Hancock"      0.07
    Activation Density: 0.049%

    No Known Activations