INDEX
    Explanations
    No Explanations Found
    Negative Logits
    " derog"  -0.73
    "downs"  -0.67
    "hler"  -0.67
    "audio"  -0.65
    "gae"  -0.64
    "--------------------------------------------------------"  -0.64
    " quotas"  -0.63
    " retaliation"  -0.62
    "friends"  -0.62
    "itism"  -0.61
    Positive Logits
    "ancock"  0.81
    " Gutenberg"  0.73
    " Surviv"  0.72
    " Pradesh"  0.71
    "ogly"  0.70
    "ospace"  0.70
    " Awareness"  0.69
    "phis"  0.67
    " profession"  0.65
    " Technician"  0.64
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.