INDEX
    Explanations

    instances of the word "vor" and its variations, indicating a focus on prepositions or temporal references

    Negative Logits
    "ë¦"            -0.16
    "pton"          -0.15
    "yc"            -0.15
    "à¹Īà¸Ńà¸Ļ"     -0.14
    "cms"           -0.14
    "ynth"          -0.14
    "esan"          -0.14
    "iances"        -0.14
    "ymoon"         -0.14
    "gle"           -0.14
    Positive Logits
    " allem"        0.22
    " Ort"          0.20
    "arl"           0.20
    "rang"          0.20
    " Thorn"        0.16
    "orage"         0.16
    "rats"          0.16
    "rag"           0.16
    "her"           0.16
    " unt"          0.16
    Activation Density: 0.008%

    No Known Activations