    Negative Logits
    Token           Logit
    " it"           1.38
    "t"             1.27
    " I"            1.23
    "c"             1.10
    " идеи"         1.09
    " are"          0.96
    " integra"      0.96
    " interessi"    0.94
    " appels"       0.94
    "s"             0.94
    POSITIVE LOGITS
    Token           Logit
    "↵↵"            1.45
    "us"            1.23
    "م"             1.23
    (not rendered)  1.23
    "ти"            1.21
    (not rendered)  1.19
    "ى"             1.19
    "to"            1.05
    (not rendered)  1.05
    "м"             1.04
    Act Density 0.050%

    No Known Activations