INDEX
    Negative Logits
    Token           Logit
    "<bos>"         -2.40
    "/*"            -0.70
    (whitespace)    -0.66
    "<?"            -0.65
    " onView"       -0.60
    "govina"        -0.60
    "exitRule"      -0.59
    " дописавши"    -0.58
    "quine"         -0.57
    "apnews"        -0.56
    Positive Logits
    Token           Logit
    " emphat"       1.77
    " milf"         1.77
    " madonna"      1.76
    " affor"        1.69
    " perfet"       1.66
    " maneu"        1.59
    " accla"        1.58
    " stockholm"    1.57
    " peppa"        1.57
    " inev"         1.55
    Activation Density: 0.157%

    No Known Activations