INDEX
    Explanations

    The neuron detects descriptive adjectives referring to a person’s physical appearance.

    Negative Logits
    Token            Logit
    "력이"            -0.07
    " полити"        -0.07
    "Until"          -0.07
    " trademarks"    -0.06
    "$/)"            -0.06
    " دنیا"          -0.06
    ")에"            -0.06
    ".:."            -0.06
    " Isn"           -0.06
    "ازد"            -0.06

    Positive Logits
    Token            Logit
    "UNIT"           0.08
    "Sit"            0.07
    "Entr"           0.06
    "(hdr"           0.06
    " stronghold"    0.06
    "ior"            0.06
    "ελ"             0.06
    "Low"            0.06
    " getter"        0.06
    "φαρ"            0.06

    Activation Density: 0.493%

    No Known Activations