INDEX
    Explanations

    attributions or quotes in text

    instances of people being quoted

    Negative Logits
    estern        -0.84
    ĸļ士          -0.79
    xtap          -0.75
    ¥ŀ            -0.74
    ntil          -0.72
    pleting       -0.68
    ucha          -0.66
     earthqu      -0.65
    \/\/          -0.64
     Written      -0.64
    Positive Logits
     goodbye      1.34
     hello        0.90
     aloud        0.86
     Goodbye      0.82
     farewell     0.81
    mith          0.76
     loudly       0.74
     sorry        0.73
    :]            0.72
     bluntly      0.71
    Activation Density: 0.068%

    No Known Activations