INDEX
    Explanations

    the conjunctions and question words within a context

    Negative Logits
     Kaynak
    -0.17
    ood
    -0.17
    ı
    -0.15
    its
    -0.15
    ully
    -0.15
     Mayer
    -0.14
    PG
    -0.14
    ula
    -0.14
     dex
    -0.14
     вов
    -0.14
    Positive Logits
     ifs
    0.15
    央
    0.15
    agan
    0.15
    ्यत
    0.15
    rá
    0.15
    rance
    0.15
     manner
    0.14
    rud
    0.14
     چگونه
    0.14
    adies
    0.14
    Act Density 0.020%

    No Known Activations