INDEX
    Explanations

    instances of the placeholder token and instances of the word "THE"

    Negative Logits
    stood       -0.72
     Bulg       -0.69
     Schwar     -0.67
    lets        -0.65
    gy          -0.65
    sup         -0.65
    ãĤ£         -0.65
     Gong       -0.64
     Murdoch    -0.63
    tons        -0.61
    Positive Logits
    BOOK        1.32
    MAN         1.30
    ERSON       1.27
    VERS        1.27
    ING         1.25
    ION         1.23
    FORE        1.22
    FER         1.22
    VER         1.22
    IN          1.22
    Activation Density: 0.161%

    No Known Activations