INDEX
    Explanations

    phrases related to goodbyes or farewells

    expressions of personal emotions and experiences

    Negative Logits
    '."
    -0.62
     respectively
    -0.59
    ."
    -0.57
    arettes
    -0.56
    $$$$
    -0.56
     thereby
    -0.55
    ".
    -0.55
    angering
    -0.54
     vouchers
    -0.54
    åĮ
    -0.53
    Positive Logits
     spoiler
    0.61
     nutshell
    0.60
     spoilers
    0.60
     reader
    0.58
     Blog
    0.57
     clarification
    0.56
     blog
    0.55
     geek
    0.53
    apache
    0.52
     disclaimer
    0.52
    Activation Density: 1.115%

    No Known Activations