INDEX
    Explanations

    prepositions and conjunctions

    Negative Logits
     وق
    -0.07
     dominate
    -0.07
     tapes
    -0.06
     videot
    -0.06
     thrift
    -0.06
    ingo
    -0.06
     bombing
    -0.06
     Shame
    -0.06
     lengthy
    -0.06
    -0.06
    Positive Logits
    outdir
    0.07
    viewport
    0.07
     معن
    0.07
     meats
    0.07
    MN
    0.06
    กรรม
    0.06
    Ke
    0.06
    disable
    0.06
     deutsche
    0.06
     werden
    0.06
    Activation Density: 0.111%

    No Known Activations