INDEX
    Explanations

    phrases indicating location or presence within sentences

    Negative Logits
    xing      -0.17
    iki       -0.16
    eson      -0.16
    ly        -0.16
    ai        -0.16
    elia      -0.16
    allet     -0.15
    aring     -0.15
    per       -0.15
    lo        -0.15
    Positive Logits
    voor      0.19
     least    0.19
    assis     0.17
    ENA       0.16
    orthand   0.15
    azzo      0.14
    inati     0.14
    664       0.14
    oldown    0.14
    onet      0.14
    Activation Density: 0.009%

    No Known Activations