    Explanations

    religious figures

    Negative Logits
    Token          Logit
    " polis"       -0.07
    " Sist"        -0.07
    "letcher"      -0.06
    ".dw"          -0.06
    " lif"         -0.06
    "erti"         -0.06
                   -0.06
    " bulld"       -0.06
    " Bethesda"    -0.06
    "ativ"         -0.06

    Positive Logits
    Token          Logit
    " originate"   0.08
    "[rand"        0.07
    ".asarray"     0.07
    "cry"          0.07
    "[_"           0.06
    " hopeless"    0.06
    " exciting"    0.06
    "htag"         0.06
    "Genres"       0.06
    " direction"   0.06

    Activation Density: 0.012%

    No Known Activations