INDEX
    Explanations

    references to academic publications or studies

    Negative Logits
    " tears"          -0.80
    " flush"          -0.70
    "itness"          -0.66
    "acea"            -0.62
    "irl"             -0.61
    "keyes"           -0.60
    " horizon"        -0.60
    " wand"           -0.59
    "bed"             -0.58
    " blaster"        -0.57

    Positive Logits
    " Fifth"           0.80
    " Notting"         0.75
    "Reviewer"         0.75
    " Tenth"           0.74
    " Ninth"           0.72
    "interstitial"     0.72
    " Seventh"         0.71
    " Proceedings"     0.69
    "Method"           0.67
    "Nit"              0.66
    Activation Density: 0.065%

    No Known Activations