INDEX
    Explanations

    adjectives that express opinions or perceptions

    Negative Logits
    Token           Logit
    "itton"         -0.77
    "pour"          -0.68
    "probably"      -0.64
    " instead"      -0.62
    "instead"       -0.62
    "sometimes"     -0.61
    " Probably"     -0.60
    "Kings"         -0.60
    "rather"        -0.60
    "onen"          -0.59
    Positive Logits
    Token           Logit
    " anymore"      1.47
    " anywhere"     1.10
    " nor"          1.06
    " anything"     0.99
    " any"          0.99
    " slightest"    0.97
    " necessarily"  0.95
    " bothered"     0.90
    " whatsoever"   0.87
    "yet"           0.85
    Activation Density: 0.167%

    No Known Activations