INDEX
    Explanations

    phrases or contexts that convey curiosity or remarkability

    Negative Logits
    uts           -0.74
    arest         -0.73
    oise          -0.72
    avers         -0.71
    aper          -0.71
    required      -0.71
    ussy          -0.70
    uter          -0.68
    otent         -0.67
    reditation    -0.66
    Positive Logits
     tid            0.98
     Flavoring      0.85
     sidel          0.83
     twists         0.82
     anecdotes      0.82
     insights       0.82
     trivia         0.82
    arios           0.81
    ness            0.78
     observations   0.76
    Activation Density: 0.026%

    No Known Activations