INDEX
    Explanations

    phrases indicating the importance and relevance of information or requests

    Negative Logits
    ol
    -0.16
    olare
    -0.16
    ech
    -0.15
    ajo
    -0.15
    arian
    -0.15
    ive
    -0.15
    üz
    -0.15
    ird
    -0.15
    nod
    -0.15
    Votre
    -0.14
    Positive Logits
     tome
    0.26
     إلي
    0.23
     unto
    0.21
     لدي
    0.20
     μαζί
    0.17
    velt
    0.17
    ="__
    0.17
     إليه
    0.16
     bagi
    0.16
    aney
    0.15
    Activation Density: 0.432%

    No Known Activations