INDEX
    Explanations

    names of political figures

    Negative Logits
    pter      -0.87
    ources    -0.70
    hend      -0.70
    ship      -0.69
    ships     -0.69
    lance     -0.68
    inus      -0.66
    oslav     -0.65
    effic     -0.64
    occ       -0.64
    Positive Logits
    adesh          0.61
     è£ıè          0.61
     lett          0.60
     Point         0.55
     corpus        0.54
     Literature    0.54
     warr          0.53
     punch         0.53
    she            0.52
     surv          0.51
    Activation Density: 0.104%

    No Known Activations