INDEX
    Explanations

    The neuron fires on longer, domain-specific technical terms and specialized jargon.

    Negative Logits
     frightened
    -0.07
    Gay
    -0.06
     ApiService
    -0.06
     importantes
    -0.06
    .mutable
    -0.06
     others
    -0.06
    .note
    -0.06
    -0.06
     araya
    -0.06
     Reasons
    -0.06
    Positive Logits
     Work
    0.09
     footage
    0.08
     zboží
    0.08
     Conditional
    0.08
     legislation
    0.08
     research
    0.08
     imagery
    0.07
     signage
    0.07
     work
    0.07
     Regulation
    0.07
    Activation Density: 0.864%

    No Known Activations