INDEX
    Negative Logits
    Token           Logit
    "overs"         -0.72
    " predec"       -0.71
    "amate"         -0.69
    "phia"          -0.67
    " hindsight"    -0.66
    " careless"     -0.66
    " arrang"       -0.65
    " greedy"       -0.64
    "turned"        -0.63
    " heirs"        -0.63
    Positive Logits
    Token           Logit
    " 07"           0.89
    " 05"           0.88
    " 06"           0.87
    " Accessed"     0.85
    " 04"           0.85
    " 09"           0.85
    " 03"           0.83
    " 01"           0.82
    " 02"           0.82
    " 08"           0.81
    Activation Density: 0.033%

    No Known Activations