    Explanations

    references to physical disconnection or amputation

    phrases related to cutting off or severing connections, resources, or limbs

    Negative Logits
    Bench      -0.80
    WM         -0.73
    antine     -0.72
    Rating     -0.70
    episode    -0.69
    ECH        -0.68
    FUL        -0.68
    MAT        -0.67
    older      -0.65
    OD         -0.65
    Positive Logits
     communication     1.06
     access            1.02
     contact           0.98
     limbs             0.89
     ties              0.87
     valves            0.86
     communications    0.86
     limb              0.84
     supply            0.84
     disbelief         0.83
    Act Density 0.067%

    No Known Activations