INDEX
    Explanations

    the word "back" with varying intensity across different contexts

    Negative Logits
    Token          Logit
     Parenthood    -0.77
    inational      -0.71
    risome         -0.68
    ISION          -0.67
    cules          -0.67
    kish           -0.63
    isions         -0.61
    cular          -0.61
    entric         -0.61
    女             -0.61
    Positive Logits
    Token          Logit
    wards          1.21
    lash           1.21
    doors          1.03
    door           1.03
    packing        1.01
    ward           1.00
    GROUND         0.99
    dated          0.95
    haul           0.94
    strap          0.94
    Activation Density: 0.034%

    No Known Activations