INDEX
    Explanations

    the word "that" at a high activation level

    the word "that" used in various contexts

    Negative Logits
    Token                 Logit
    EMBER                 -0.66
    cept                  -0.64
     ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³   -0.63
    arest                 -0.62
    Tank                  -0.62
    izont                 -0.62
    ãĥ¡                   -0.62
    Pont                  -0.61
    ãĥīãĥ©                -0.59
    ãĤ·                   -0.59
    Positive Logits
    Token                 Logit
     although             0.95
     "[                   0.86
     '[                   0.76
     sounded              0.72
     whilst               0.69
    soever                0.68
     while                0.67
     "...                 0.66
    utenberg              0.64
     contradicts          0.64
    Activation Density: 0.193%

    No Known Activations