    Explanations

    The neuron detects mentions of awards or honors and related superlative descriptions (e.g., “the award is the highest honor given…”).

    Negative Logits
    Token            Logit
    Detector         -0.08
    ollider          -0.07
    وات              -0.07
    ladatel          -0.06
    нем              -0.06
                     -0.06
    pios             -0.06
     contradictory   -0.06
    ัว               -0.06
    (reverse         -0.06
    Positive Logits
    Token            Logit
    ATERIAL          0.07
    cro              0.06
    406              0.06
    21               0.06
    miss             0.06
    GOOD             0.06
     영화            0.06
     weed            0.06
    errs             0.06
     MOM             0.06
    Act Density 0.024%

    No Known Activations