INDEX
    Explanations

    phrases indicating simultaneous events or actions

    references to the concept of time

    Negative Logits
    nce             -0.73
    iland           -0.67
     Founders       -0.67
    ipedia          -0.66
    ifer            -0.66
     fingert        -0.65
    psey            -0.64
    ged             -0.63
    ilater          -0.61
    halla           -0.60
    Positive Logits
    女               0.84
     respecting     0.76
     emphasizing    0.69
     embracing      0.69
     minimizing     0.68
     anticipating   0.67
     ignoring       0.67
     keeping        0.66
     releasing      0.65
     acknowledging  0.65
    Activation Density 0.029%

    No Known Activations