INDEX
    Explanations

    adjectives conveying importance or severity

    words associated with significance or urgency

    Negative Logits
    "arettes"      -0.83
    "runners"      -0.81
    "parents"      -0.79
    "stories"      -0.78
    "users"        -0.78
    "ometers"      -0.76
    "ubi"          -0.76
    " Controls"    -0.76
    "owers"        -0.75
    "aneers"       -0.74
    Positive Logits
    " endeavor"      1.11
    " piece"         1.05
    " feat"          1.02
    " foray"         1.01
    " distinction"   1.00
    " thing"         0.99
    " tale"          0.99
    " scenario"      0.97
    " beast"         0.95
    " phenomenon"    0.95
    Activation Density: 0.148%

    No Known Activations