INDEX
    Explanations

    prepositions or relational words

    Negative Logits
    orous         0.50
     opted        0.48
    ites          0.46
    仕事            0.45
     দেখার        0.45
                  0.45
    ischen        0.44
    eningkatan    0.43
    iciens        0.42
     intitul      0.42
    Positive Logits
     solely       0.64
     towar        0.60
     within       0.59
     WITHIN       0.57
     nejen        0.56
     from         0.56
     către        0.56
     WITHOUT      0.55
     toward       0.54
    ใน            0.54
    Act Density 0.094%

    No Known Activations