INDEX
    Explanations

    phrases related to personal stories or narratives

    Negative Logits
    Token             Logit
    <bos>             -2.02
     vainly           -1.08
    (blank)           -0.96
     miscon           -0.90
     merrily          -0.89
    (blank)           -0.87
     effectually      -0.86
    /*!               -0.85
     triumphantly     -0.85
     nobly            -0.84
    Positive Logits
    Token             Logit
    ly                1.14
     tramont          1.00
     Luglio           0.94
     kasa             0.94
     dott             0.93
     kac              0.92
     pank             0.90
     umo              0.89
     ristor           0.89
     saar             0.88
    Act Density 1.682%
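
The page does not state how the density figure is computed. One common reading is the fraction of sampled token positions on which the feature's activation is nonzero. The sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations for 8 token positions (illustrative only).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 1.682% would mean the feature fires on roughly 17 of every 1,000 sampled tokens.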

    No Known Activations