    Explanations

    The neuron primarily detects occurrences of the word “compromise.”

    Negative Logits
    " Sun"        -0.07
    " Animal"     -0.07
    "([])↵"       -0.07
    "\tJLabel"    -0.07
    "541"         -0.07
    " faculty"    -0.07
    " Flint"      -0.07
    "۱۱"          -0.06
    "13"          -0.06
    " north"      -0.06
    Positive Logits
    " compromise"     0.12
    " compromises"    0.09
    " compromising"   0.09
    " undermin"       0.08
    " compromised"    0.08
    " comprom"        0.08
                      0.07
    "okens"           0.07
    "isme"            0.07
    " honest"         0.07
    Activation Density: 0.006%

    No Known Activations