    Negative Logits
    Token           Logit
    "The"           -1.12
    "tagez"         -0.74
    " épaules"      -0.74
    " jambe"        -0.72
    " spalle"       -0.70
    " besök"        -0.70
    " pulito"       -0.70
    " bicchiere"    -0.67
    " abbraccio"    -0.67
    " jambes"       -0.66
    Positive Logits
    Token           Logit
    " following"    0.82
    " U"            0.77
    " majority"     0.67
    " original"     0.66
    " most"         0.65
    " latter"       0.65
    " first"        0.63
    " term"         0.62
    " same"         0.61
    " main"         0.61
    Activation Density: 0.228%

    No Known Activations