INDEX
    Explanations

    phrases related to hard work or intense effort

    instances of the word "hard" indicating difficulty or effort

    Negative Logits
    uador          -0.72
     Tyrann        -0.68
    ipt            -0.63
    umbn           -0.63
     republic      -0.63
     Avalon        -0.60
     Tsukuyomi     -0.59
    ãĥĺãĥ©         -0.59
    gdala          -0.59
     Reincarn      -0.58
    Positive Logits
    working        1.28
    wired          1.19
    coded          1.17
    ening          1.04
    ball           0.95
    core           0.93
    hitting        0.91
    BALL           0.91
     pressed       0.86
     hitting       0.84
    Activation Density: 0.034%

    No Known Activations