INDEX
    Explanations

    titles of articles or papers

    instances of the words "titled" and "entitled" indicating the title of a document or article

    Negative Logits
    eda             -0.75
    ometers         -0.72
     squared        -0.66
    olson           -0.65
    phthal          -0.64
     Era            -0.63
    arthy           -0.63
    asted           -0.61
    APD             -0.60
    ────────        -0.59
    Positive Logits
     "#             0.78
    pins            0.75
    selves          0.73
     titled         0.73
     "<             0.72
    eous            0.71
    forward         0.69
    checks          0.67
    nces            0.67
     namely         0.67
    Act Density 0.028%
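
For context on the density figure above, here is a minimal sketch of one common way such a number is computed: the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are assumptions for illustration; the page does not say how its value was derived.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    `activations` is a 1-D sequence of per-token activations for a single
    feature (hypothetical input; not data from this page).
    """
    acts = np.asarray(activations, dtype=float)
    if acts.size == 0:
        return 0.0
    return float(np.count_nonzero(acts > threshold)) / acts.size

# Toy example: the feature fires on 3 of 10,000 tokens.
acts = np.zeros(10_000)
acts[[5, 120, 9_000]] = [1.2, 0.4, 3.1]
density = activation_density(acts)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard value
```

Under this definition, a density of 0.028% would correspond to the feature firing on roughly 28 of every 100,000 sampled tokens.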

    No Known Activations