INDEX
    Explanations

    prepositions and conjunctions indicating relationships and connections between entities

    Negative Logits
    中的
    -0.19
    上的
    -0.16
    liches
    -0.15
     dice
    -0.15
     Produkte
    -0.15
     людей
    -0.14
    cka
    -0.14
    ogra
    -0.14
    šku
    -0.14
     Von
    -0.14
    Positive Logits
     dem
    0.44
     einem
    0.43
    dem
    0.34
     einer
    0.32
     seinem
    0.31
     diesem
    0.30
    inem
    0.28
     ihrem
    0.26
     der
    0.25
    DEM
    0.24
    Act Density 0.031%
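
The page does not define "Act Density". A common reading, assumed here, is the fraction of token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only; the function name `activation_density`, the threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 31 of which activate the feature.
acts = [0.0] * 100_000
for i in range(31):
    acts[i * 3_000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.031% corresponds to roughly 31 active positions per 100,000 tokens, which marks the feature as very sparse.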

    No Known Activations