INDEX
    Explanations

    scientific/academic texts

    NEGATIVE LOGITS
     perpetrated  -0.07
    -0.07
     ceasefire  -0.07
    culos  -0.07
    -0.07
    愛情  -0.07
    -0.07
     exercised  -0.07
    _-_  -0.07
    原则  -0.06
    POSITIVE LOGITS
    fts  0.07
    _RED  0.07
    いで  0.07
    _success  0.07
     Purdue  0.06
     bbc  0.06
     spielen  0.06
     suo  0.06
     unfolding  0.06
     audition  0.06
    Act Density 0.005%

    No Known Activations