INDEX
    Explanations

    authors and studies mentioned in research papers

    phrases that indicate authorship or references to reports and studies

    Negative Logits
    Token          Logit
    "pring"        -0.70
    "llan"         -0.65
    " twitch"      -0.63
    "Enlarge"      -0.62
    " dude"        -0.62
    "hunt"         -0.61
    " vibration"   -0.61
    " broom"       -0.61
    "RL"           -0.60
    "_-"           -0.59
    Positive Logits
    Token            Logit
    " books"         0.78
    " articles"      0.77
    " Letters"       0.74
    "letters"        0.74
    " bestselling"   0.72
    "Ô"              0.72
    " novels"        0.71
    " memoir"        0.71
    " unpublished"   0.70
    " poems"         0.69
    Activation Density: 0.078%

    No Known Activations