INDEX
    Explanations

    The main thing this neuron does is detect occurrences of heating or high‐temperature terms (e.g., “hot,” “heated,” “heating”).

    New Auto-Interp
    Negative Logits
    958
    -0.07
    137
    -0.07
    -light
    -0.06
     хроничес
    -0.06
    853
    -0.06
    UNK
    -0.06
     effectiveness
    -0.06
    bben
    -0.06
     prerequisites
    -0.06
    options
    -0.06
    POSITIVE LOGITS
     Eat
    0.07
     Ire
    0.07
     çözüm
    0.07
     sinc
    0.07
     entren
    0.06
     neckline
    0.06
    ;if
    0.06
     maxY
    0.06
     pand
    0.06
    终于
    0.06
    Act Density 0.010%

    No Known Activations