    Explanations

    This neuron activates on subjective evaluative adjectives and adverbs that express opinions or judgments about quality (e.g. “poor,” “positive,” “prestigious,” “catastrophic”).

    Negative Logits
    " Birmingham"     -0.07
    " Kentucky"       -0.07
    " d"              -0.07
    "posing"          -0.06
    " dull"           -0.06
                      -0.06
    ".UndefOr"        -0.06
    "REL"             -0.06
    " Officers"       -0.06
    " werde"          -0.06

    Positive Logits
    "ropa"            0.06
    "-opt"            0.06
    " Бел"            0.06
    "INIT"            0.06
    " svět"           0.06
    "етод"            0.06
    "ственное"        0.06
    "aciones"         0.06
    " spi"            0.06
    "aris"            0.06

    Activation Density: 0.032%

    No Known Activations