INDEX
    Explanations

    instances of the word "compelling."

    references to engaging or persuasive content

    Negative Logits
    "hops"       -0.89
    "pez"        -0.84
    "hop"        -0.80
    "sterdam"    -0.77
    "atel"       -0.71
    "alde"       -0.65
    "ource"      -0.64
    "pec"        -0.64
    " Sloan"     -0.63
    "abad"       -0.60
    Positive Logits
    "ly"         1.05
    "ingly"      1.00
    "NESS"       0.84
    "LY"         0.83
    "ively"      0.82
    "enough"     0.79
    " enough"    0.79
    "ments"      0.76
    "ibly"       0.76
    "reason"     0.76
    Activation Density: 0.028%

    No Known Activations