INDEX
    Explanations

    words expressing amazement or awe

    expressions of wonder and admiration

    Negative Logits
    " Thieves"      -0.70
    "gradient"      -0.65
    "secondary"     -0.65
    " FF"           -0.63
    "containing"    -0.63
    "dule"          -0.63
    " Agents"       -0.63
    "Solid"         -0.62
    "Short"         -0.62
    "Gamer"         -0.61
    Positive Logits
    " awe"          1.18
    "htaking"       0.92
    "urous"         0.91
    " aston"        0.91
    " incred"       0.87
    " amaz"         0.86
    "ruciating"     0.85
    "ingly"         0.85
    "upe"           0.84
    " amazed"       0.82
    Act Density 0.013%

    No Known Activations