INDEX
    Explanations

    mentions of characters' names in a dialogue or interview setup

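    Negative Logits
    <bos>                                 -2.59
    (whitespace only)                     -0.70
     uzyskać                              -0.68
    /*** (with trailing line break)       -0.68
    (token not rendered)                  -0.67
    <? (with trailing line break)         -0.66
    <!-- (with trailing line break)       -0.62
    /*                                    -0.61
    <?                                    -0.59
     springfox                            -0.59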
    Positive Logits
     eiffel       1.10
     cartier      1.03
     stockholm    0.99
     indestru     0.99
     umbre        0.95
     madonna      0.95
     affor        0.93
     effe         0.92
     ecru         0.91
     imposs       0.91
    Act Density 0.736%

    No Known Activations