INDEX
    Explanations

    This neuron activates on the JSON key “reason”, which marks where the response explains or justifies a decision.

    Negative Logits
     αρι          -0.07
    чес           -0.06
    sku           -0.06
    apyrus        -0.06
     meats        -0.06
     Yönetim      -0.06
                  -0.06
    privacy       -0.06
     پژوه         -0.06
    ملة           -0.06
    Positive Logits
     sand         0.07
     anticipated  0.06
                  0.06
    yms           0.06
    kyně          0.06
    _render       0.06
     newPath      0.06
    ента          0.06
    Chuck         0.06
    keiten        0.06
    Activation Density: 0.018%

    No Known Activations