INDEX
    Explanations

    contractions and auxiliary verbs indicating likelihood or necessity

    Negative Logits
    "ATHER"       -0.16
    "ãĥ¼ãĤ¯"      -0.15
    "EEDED"       -0.15
    "yh"          -0.15
    "_DLL"        -0.14
    " obs"        -0.14
    "alah"        -0.14
    "ather"       -0.14
    "ullan"       -0.14
    "úsqueda"     -0.14
    Positive Logits
    " they"       0.40
    " we"         0.38
    " it"         0.30
    " они"        0.27
    " вони"       0.27
    " she"        0.27
    " there"      0.27
    " he"         0.27
    "they"        0.26
    " оно"        0.25
    Act Density 0.162%
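
As a rough illustration of what an activation density figure like the one above may represent, the sketch below computes the fraction of token positions on which a feature fires. This is an assumption about the metric, not the dashboard's documented method; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (number of active tokens) / (total tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 2 of which activate the feature.
sample = [0.0] * 998 + [1.3, 0.7]
print(f"Act Density {activation_density(sample):.3%}")
```

Under this reading, a density of 0.162% would mean the feature is active on roughly 1 token in 600.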

    No Known Activations