INDEX
    Explanations

    occurrences of the word "the."

    Negative Logits
    Token              Logit
    "SharedDtor"       -0.86
    "parsedMessage"    -0.85
    "featureID"        -0.84
    "OGND"             -0.83
    "fromnode"         -0.82
    "Personendaten"    -0.77
                       -0.73
    "<unused14>"       -0.73
    "<unused41>"       -0.72
    "<unused79>"       -0.72
    Positive Logits
    Token              Logit
    " the"             1.16
    "The"              1.13
    " THE"             1.09
    " The"             1.01
    "THE"              0.92
    "the"              0.89
    " their"           0.63
    "ethe"             0.59
    "sthe"             0.57
    "OfThe"            0.57
    Activation Density: 0.006%

    No Known Activations