INDEX
    Explanations

    words related to gates or barriers that can be opened or closed

    words related to states, conditions, or processes

    Negative Logits
    nces      -0.73
    æµ        -0.73
    orks      -0.65
     temp     -0.63
    DX        -0.62
    nant      -0.62
    ounces    -0.60
     Ashes    -0.60
    200000    -0.60
    nia       -0.58
    Positive Logits
     separating  0.80
     fences      0.76
     guarding    0.76
    otes         0.72
     guarded     0.70
    utherford    0.69
     fortun      0.69
     entrances   0.69
    riers        0.68
     manned      0.68
    Act Density 0.063%

    No Known Activations