    Explanations

    Coding discussions

    This neuron responds to phrasing that makes recommendations or suggestions (e.g. modal auxiliaries like “would,” “should,” “could” indicating advice).

    Negative Logits
    Token             Logit
    " Exploration"    -0.06
    " complains"      -0.06
    " Pawn"           -0.06
    " 대학"           -0.06
    " fraudulent"     -0.06
    "xdc"             -0.06
    ".constraints"    -0.06
    " öğren"          -0.05
    " schw"           -0.05
    "Full"            -0.05
    Positive Logits
    Token             Logit
    "�n"              0.07
    "ريع"             0.07
    "注册"            0.07
                      0.06
    " میل"            0.06
    " servisi"        0.06
                      0.06
    " Extend"         0.06
    " ó"              0.06
                      0.06
    Act Density 0.117%

    No Known Activations