INDEX
    Explanations
    Negative Logits
    Token       Logit
    "kund"      -0.47
    "red"       -0.47
    " red"      -0.47
    "teen"      -0.44
    " mouth"    -0.42
    " Fre"      -0.42
    " fac"      -0.42
    " kou"      -0.41
    "})));"     -0.41
    "stage"     -0.41
    Positive Logits
    Token              Logit
    " normaux"         0.84
    " Majefty"         0.79
    " voisins"         0.78
    " sauvages"        0.78
    " zelve"           0.77
    " automatiques"    0.77
    " asiatique"       0.76
    " république"      0.76
    " prisonniers"     0.72
    " extérieurs"      0.71
    Activation Density: 0.039%

    No Known Activations