INDEX
    Explanations

    figurative language

    This neuron detects instructions about using “figures of speech.”

    Negative Logits
    Token        Logit
                 -0.06
     nella       -0.06
    ッシュ       -0.06
     annoying    -0.06
    fon          -0.06
    tingham      -0.06
    ประกาศ       -0.06
    оны          -0.06
    _MISS        -0.06
     ganze       -0.06

    Positive Logits
    Token        Logit
    ΗΡ           0.07
    getInstance  0.07
     trí         0.07
     Вот         0.07
    popover      0.06
    odal         0.06
     Deg         0.06
     resembled   0.06
     strategies  0.06
                 0.06
    Activation Density: 0.011%

    No Known Activations