    Explanations

    distractions

    The neuron detects mentions of distraction or diversion (e.g., “distracts,” “distraction”).

    Negative Logits
    .options          -0.07
     contemplating    -0.07
    Hamilton          -0.07
     graffiti         -0.07
     express          -0.06
    address           -0.06
     Steering         -0.06
     Deutsche         -0.06
     pioneer          -0.06
     entails          -0.06
    Positive Logits
    .Ed               0.07
     происходит       0.06
     foi              0.06
     มกราคม           0.06
    。不              0.06
    .Fat              0.06
                      0.06
    ilmek             0.06
     Decompiled       0.06
    اورزی             0.06
    Activation Density: 0.065%

    No Known Activations