    Explanations

    The neuron fires on occurrences of the standalone word “Drop” (often part of a brand, product, or title).

    Negative Logits
    Token           Logit
    " Sail"         -0.07
    " Planning"     -0.07
    " Wei"          -0.06
    " Management"   -0.06
    " sight"        -0.06
    " Neil"         -0.06
    " ingenious"    -0.06
    " quân"         -0.06
    " planning"     -0.06
    " VF"           -0.06
    Positive Logits
    Token           Logit
    " drop"          0.16
    " Drop"          0.14
    " dropped"       0.13
    "Drop"           0.13
    " DROP"          0.12
    " dropping"      0.12
    "-drop"          0.11
    "drop"           0.10
                     0.10
    " dropout"       0.10
    Activation Density: 0.019%

    No Known Activations