INDEX
    Explanations

    descriptive phrases related to the physical environment

    Negative Logits
    Token        Logit
    "chairs"     -0.82
    "idents"     -0.80
    "encers"     -0.77
    "yond"       -0.73
    "olas"       -0.73
    "eeds"       -0.72
    "lees"       -0.71
    "bots"       -0.71
    "enes"       -0.71
    "masters"    -0.69

    Positive Logits
    Token             Logit
    " hurdle"         1.13
    " installment"    0.97
    " thing"          0.97
    " glimpse"        0.94
    " dose"           0.94
    " reminder"       0.92
    " glance"         0.90
    " chance"         0.90
    " disclaimer"     0.88
    " piece"          0.87

    Activation Density: 0.074%

    No Known Activations