    Explanations

    significant nouns and verbs related to objectives and actions

    Negative Logits
    aram          -0.16
     thiên        -0.15
    abox          -0.14
    /inet         -0.14
    ausal         -0.14
    oste          -0.14
    ANI           -0.14
    ects          -0.13
    extract       -0.13
    ervers        -0.13
    Positive Logits
     dafür        0.20
     dazu         0.19
    具体          0.16
    ulla          0.15
    زن            0.14
    ettes         0.14
     Wy           0.14
     attendant    0.14
    519           0.14
     reversal     0.13
    Activation Density: 0.026%

    No Known Activations