INDEX
    Explanations

    references to a specific person's name

    Negative Logits
    netflix     -0.80
     Dominion   -0.72
    lift        -0.71
    WARD        -0.69
    jet         -0.69
    jin         -0.68
    hood        -0.67
    cloth       -0.64
    current     -0.62
    Cause       -0.62
    Positive Logits
    ician       1.00
    olit        0.88
    ano         0.87
    ancies      0.83
    ary         0.82
    inelli      0.81
    opol        0.81
    icians      0.80
    eness       0.76
    aly         0.76
    Activation Density 0.018%

    No Known Activations