INDEX
    Explanations

    prepositions and articles

    Negative Logits
    Token         Logit
    " same"       -0.67
    " Yoshida"    -0.63
    " latter"     -0.62
    " pula"       -0.61
    " intere"     -0.59
    "ौर"          -0.59
    " Claude"     -0.58
    " model"      -0.57
    " more"       -0.57
    " work"       -0.56
    Positive Logits
    Token               Logit
    " the"              1.17
    "GraphicsUnit"      1.03
    "+#+"               0.98
    "__(/*!"            0.96
    "+#+#"              0.96
    " consultato"       0.90
    "PositiveButton"    0.89
    "Portail"           0.86
    " את"               0.85
    " ristoranti"       0.85
    Activation Density: 0.528%

    No Known Activations