    Explanations
    No Explanations Found
    Negative Logits
    Token                   Logit
    "roman"                 -0.83
    "ourgeois"              -0.79
    "senal"                 -0.76
    " defensively"          -0.67
    " rife"                 -0.66
    " abound"               -0.65
    "arc"                   -0.65
    "antage"                -0.64
    " ================="    -0.63
    "fusc"                  -0.63
    Positive Logits
    Token                   Logit
    "atri"                  0.81
    " Truman"               0.66
    "Tel"                   0.66
    " Siri"                 0.62
    "IRO"                   0.62
    " husbands"             0.62
    " sang"                 0.61
    "arta"                  0.61
    "IPP"                   0.60
    " Independence"         0.59
    Act Density 0.000%

    This feature has no known activations.