INDEX
    Explanations
    Negative Logits
     nim
    -0.07
    ieron
    -0.06
    oc
    -0.06
     comentarios
    -0.06
    -0.06
    tt
    -0.06
    -0.06
     duplicate
    -0.06
     Pax
    -0.06
     initiate
    -0.06
    Positive Logits
    _ASSUME
    0.07
     France
    0.07
    MenuItem
    0.06
     Quebec
    0.06
    oulouse
    0.06
    ozy
    0.06
    illard
    0.06
     بسیاری
    0.06
    (robot
    0.06
    être
    0.06
    Act Density 0.367%

    No Known Activations