    Explanations

    This neuron detects the kind of explanatory language used when pointing out puns or phonetic similarities (e.g. tokens like “sounds,” “similar,” and quoted word comparisons).

    Negative Logits
    Token               Logit
    "brane"             -0.06
    "_mid"              -0.06
    "242"               -0.06
    "otate"             -0.06
    "_management"       -0.06
    " caffeine"         -0.06
    "omial"             -0.06
    " ad"               -0.06
    "_ctr"              -0.05
    "čil"               -0.05

    Positive Logits
    Token               Logit
    " inclusive"        0.07
    " BTN"              0.07
    "diamond"           0.07
    "اقتص"              0.07
    " mari"             0.06
    " alınan"           0.06
    "rule"              0.06
    " prostituerade"    0.06
    "JEXEC"             0.06
                        0.06

    Activation Density: 0.038%

    No Known Activations