    Explanations

    phrases emphasizing the concept of value, importance, or significance

    Negative Logits
    Token            Logit
    " maximum"       -0.16
    " somehow"       -0.16
    ".react"         -0.15
    "maximum"        -0.15
    " somewhere"     -0.15
    " Maximum"       -0.15
    "Maximum"        -0.14
    "æŃ"             -0.14
    "sez"            -0.14
    " rave"          -0.13
    Positive Logits
    Token            Logit
    " degree"        0.20
    " weight"        0.19
    "urg"            0.19
    " detail"        0.18
    " importance"    0.18
    " amount"        0.18
    " effort"        0.17
    " pressure"      0.17
    " benefit"       0.16
    " influence"     0.16
    Activation Density: 0.100%

    No Known Activations