    Explanations

    references to emptiness or abandonment

    Negative Logits
    "arya"          -0.80
    "ect"           -0.70
    "Downloadha"    -0.68
    "Murray"        -0.67
    "abol"          -0.67
    "ection"        -0.67
    "ector"         -0.65
    "arin"          -0.65
    "iane"          -0.64
    " appropri"     -0.62

    Positive Logits
    " space"         0.92
    " spaces"        0.91
    " shelves"       0.86
    " shells"        0.81
    " calories"      0.81
    " stomach"       0.80
    " storefront"    0.79
    " bottles"       0.79
    " Spaces"        0.78
    " cavity"        0.74
    Activation Density: 0.094%

    No Known Activations