    Explanations

    though to a lesser extent, the neuron also identifies government-related terms

    terms related to political content and discussions

    Negative Logits
    scl           -0.88
    avorite       -0.72
    duino         -0.72
    gypt          -0.68
    manship       -0.68
     practicable  -0.68
    nir           -0.66
    hire          -0.66
    HOME          -0.65
    ylum          -0.65
    Positive Logits
    ized        1.62
    ization     1.57
    ised        1.30
    izing       1.30
    izations    1.27
    isation     1.23
    izes        1.22
    ize         1.13
    ified       1.10
    ation       1.05
    Activation Density: 0.036%

    No Known Activations