INDEX
    Explanations
    Negative Logits
     Invaders        -0.81
     WWII            -0.81
     1914            -0.80
     informal        -0.79
     Cooperative     -0.78
     cooperative     -0.77
     oldest          -0.77
     lightly         -0.74
     Nemesis         -0.73
     Prohibition     -0.73
    Positive Logits
    software          1.37
    wiki              1.32
    blog              1.24
    youtu             1.23
    blogs             1.21
    uploads           1.15
    lower             1.14
    analy             1.13
    general           1.13
    upload            1.12
    Act Density 0.015%

    No Known Activations