INDEX
    Explanations

    AI assistant safety guidelines

    This neuron detects prominent headings, section titles, or otherwise emphasized/topic-signaling words in the text.

    key topic nouns in instructional or explanatory content.

    New Auto-Interp
    Negative Logits
     symbolize
    0.31
     tiež
    0.31
     सुरुवात
    0.29
     toning
    0.29
    ल्प
    0.29
    centos
    0.29
    最初に
    0.29
    will
    0.28
    dct
    0.28
    ക്കാണ്
    0.27
    POSITIVE LOGITS
     grabación
    0.34
     dukkham
    0.32
     vítimas
    0.32
     violencia
    0.31
     violência
    0.31
     พระเจ้า
    0.30
     violently
    0.30
     victimes
    0.30
     mish
    0.29
     comercio
    0.29
    Act Density 0.081%

    No Known Activations