INDEX
    Explanations
    Negative Logits
     Agra         -0.70
     Boko         -0.61
     Pelop        -0.60
     contactar    -0.60
     IAEA         -0.59
     Winslow      -0.58
     crossword    -0.57
     Sega         -0.57
     Keats        -0.56
    homonymie     -0.56
    Positive Logits
     the          0.87
    "):\n         0.79
    (blank)       0.78
    "]\n          0.78
    AlterField    0.76
    )";\n         0.76
    "])\n         0.76
    )");\n        0.76
    }{*}{         0.74
    )];\n         0.74
    Activation Density: 0.365%

    No Known Activations