INDEX
    Explanations

    specific concepts or roles

    Negative Logits
    сам  0.40
    this  0.39
     нашего  0.39
     our  0.38
     overcomes  0.37
    oliath  0.36
     আমাদের  0.36
    ология  0.35
    사와  0.35
    ilion  0.34
    Positive Logits
     Alguns  0.54
     Certaines  0.51
     bazı  0.51
     Certains  0.50
     Algunas  0.47
     niektórych  0.46
     Beberapa  0.46
     frases  0.46
     አንዳንድ  0.45
     antaranya  0.45
    Activation Density 0.009%

    No Known Activations