    Explanations

    The neuron activates primarily on the word “simple” (as in “I have a simple …”) when a user is describing their scenario.

    Negative Logits
    วก            -0.07
     Orchard      -0.06
    )↵↵           -0.06
    ewan          -0.06
    ustain        -0.06
    ultan         -0.06
    ()});↵        -0.06
     pageInfo     -0.06
    rack          -0.06
    -0.06
    Positive Logits
     simple       0.07
     fatalities   0.07
     özel         0.07
    .scal         0.07
     boto         0.07
     cerca        0.07
    controlled    0.06
    _comment      0.06
    toLowerCase   0.06
    เฮ            0.06
    Activation Density: 0.026%

    No Known Activations