INDEX
    Explanations
    Negative Logits
     sèche            -0.84
     étoit            -0.82
     détruit          -0.79
     gloire           -0.79
     extremamente     -0.78
     fumée            -0.78
     prêtres          -0.78
     excès            -0.78
     extremadamente   -0.76
     détru            -0.75
    Positive Logits
    ally              0.76
    ing               0.69
    ting              0.65
    ly                0.60
    ning              0.57
    izing             0.54
    iting             0.53
    iner              0.52
     round            0.50
    intios            0.49
    Act Density 0.076%

    No Known Activations