INDEX
    Explanations

    occurrences of the word "this"

    Negative Logits
    Token        Logit
    "mare"       -0.17
    "rica"       -0.15
    " sound"     -0.15
    " others"    -0.15
    " Mare"      -0.15
    " Ort"       -0.14
    "osl"        -0.14
    ".bs"        -0.14
    "tring"      -0.14
    "lah"        -0.14
    Positive Logits
    Token        Logit
    "ход"        0.16
    "ensch"      0.15
    "/th"        0.14
    " Tactical"  0.14
    "alu"        0.14
    "ODY"        0.14
    "elho"       0.14
    "ffee"       0.14
    "LLU"        0.14
    "ewise"      0.14
    Activation Density 0.152%

    No Known Activations