    Explanations
    No Explanations Found
    Negative Logits
    Token        Logit
    "paren"      -0.82
    "Height"     -0.82
    "eln"        -0.79
    "raq"        -0.78
    "fur"        -0.77
    "Jews"       -0.74
    "uf"         -0.73
    "smoking"    -0.71
    "anyahu"     -0.69
    "byn"        -0.69
    Positive Logits
    Token          Logit
    " COVER"       0.68
    " SWAT"        0.64
    " Violet"      0.63
    " Cutter"      0.62
    " Spartans"    0.61
    " Simone"      0.60
    " ACTION"      0.60
    "artz"         0.59
    " PERSON"      0.59
    " Lansing"     0.59
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.