INDEX
    Explanations

    interesting or engaging information

    mentions of the word "interesting."

    Negative Logits
    oise          -0.77
    uts           -0.73
    eded          -0.72
    reditation    -0.71
    helle         -0.71
    xia           -0.70
    arest         -0.70
    heed          -0.67
    chen          -0.66
    aping         -0.66
    Positive Logits
     Flavoring    0.88
     tid          0.84
    lihood        0.77
    Magikarp      0.77
     trivia       0.75
     twists       0.74
     sidel        0.74
     surprises    0.72
     shade        0.71
     curiosity    0.71
    Act Density 0.026%

    No Known Activations