    Explanations

    phrases related to controversial or sensitive topics such as incest

    references to incestuous relationships

    Negative Logits
    eah     -0.91
    ¥µ      -0.86
    ĪĴ      -0.84
    yers    -0.82
    Ĵ       -0.81
    gd      -0.78
    gments  -0.77
    inion   -0.77
    mers    -0.76
    wer     -0.75
    Positive Logits
     Frankenstein  1.11
     Kafka         1.02
     Dracula       0.98
     incest        0.90
     Malfoy        0.89
     Cullen        0.88
     vampires      0.85
     Franz         0.85
     Cassandra     0.83
    ãĤ¼ãĤ¦ãĤ¹        0.81
    Activation Density: 0.027%

    No Known Activations