INDEX
    Explanations

    occurrences of the word "the."

    NEGATIVE LOGITS
    rzy         -0.17
    zig         -0.16
    ensem       -0.16
    formats     -0.15
    rych        -0.15
    lea         -0.14
    panse       -0.14
    #           -0.14
    .idea       -0.14
    raison      -0.14
    POSITIVE LOGITS
     standpoint      0.34
     perspective     0.32
     outset          0.25
     perspectives    0.23
     beginning       0.22
     Perspective     0.21
    /to              0.21
    oth              0.21
    pers             0.19
     depths          0.19
    Act Density 0.079%

    No Known Activations