    Explanations

    words related to political parties and historical figures

    Negative Logits
    Token        Logit
     affez       -0.86
     rispond     -0.79
    <bos>        -0.76
     riemp       -0.75
     allarg      -0.73
     isolato     -0.72
    <?           -0.71
    🕗           -0.70
     cammin      -0.69
     dicono      -0.68
    Positive Logits
    Token        Logit
     strto       0.51
     Koning      0.49
    -            0.47
    ity          0.47
    ism          0.46
    ous          0.45
    ("")         0.45
    '            0.44
    (:)          0.43
                 0.43
    Act Density 0.927%

    No Known Activations