INDEX
    Explanations

    references to the study or paper being discussed

    Negative Logits
     itſelf        -0.82
    )++;           -0.79
     للاسماء       -0.75
     myſelf        -0.74
     '\\;'         -0.71
     ++)           -0.71
     ſeveral       -0.70
     themſelves    -0.69
     Eſ            -0.67
    ſelf           -0.67
    Positive Logits
     paper         1.43
    paper          1.06
     report        0.96
     Paper         0.91
    Paper          0.89
     article       0.87
     PAPER         0.79
     thesis        0.74
     study         0.71
    PAPER          0.70
    Act Density 0.388%

    No Known Activations