INDEX
    Explanations

    This neuron is looking for specific terms related to stories or events in science fiction or fantasy contexts.

    Words related to the concept of sabotage.

    Head Attr Weights
    0:0.05
    1:0.02
    2:0.23
    3:0.06
    4:0.24
    5:0.05
    6:0.03
    7:0.04
    8:0.05
    9:0.09
    10:0.05
    11:0.02
    Negative Logits
    bage:-1.55
    wikipedia:-1.49
    nces:-1.46
    steen:-1.42
    miah:-1.34
    ndra:-1.33
    aceae:-1.28
     MAP:-1.27
    untarily:-1.26
    lishes:-1.25
    Positive Logits
    WARE:1.37
     bells:1.36
     pains:1.20
    ooth:1.18
     labor:1.12
     ratt:1.06
    oman:1.06
    ioxide:1.05
    Labor:1.05
     deb:1.04
    Act Density 0.001%

    No Known Activations