INDEX
    Explanations

    instances where the text discusses potential consequences or outcomes

    descriptions of survival and health-related outcomes

    Negative Logits
     misunderstanding     -0.65
     ICO                  -0.60
     misunderstand        -0.59
     trolling             -0.56
     SEO                  -0.55
     Architects           -0.55
     behavi               -0.55
     miscon               -0.55
     Firstly              -0.55
    catentry              -0.55
    Positive Logits
    aterasu                0.69
     afterward             0.61
     twice                 0.59
    kefeller               0.59
     ninety                0.57
     averaged              0.57
     Whitman               0.56
    eligible               0.55
     virtually             0.54
    bernatorial            0.53
    Activation Density: 1.474%

    No Known Activations