INDEX
    Explanations

    proper nouns, especially names of people and places

    expressions of familial relationships and personal connections

    Negative Logits
     Briggs     -0.95
     Simmons    -0.88
     Slate      -0.85
     Summers    -0.85
     Whedon     -0.84
     Rollins    -0.81
     Scott      -0.81
     Randall    -0.80
     McKay      -0.80
    Ohio        -0.78
    Positive Logits
     fulfil     0.93
     unlaw      0.92
     Daesh      0.91
     Tanz       0.91
    arij        0.91
     Malays     0.88
    )",         0.87
    ijn         0.86
     Juda       0.86
     Tayyip     0.84
    Activation Density: 1.759%

    No Known Activations