INDEX
    Explanations

    confession-related phrases, such as "confessing," "confessed," and "confession."

    Negative Logits
    Token             Logit
    "hyde"            -0.82
    "VILLE"           -0.76
    " Elves"          -0.74
    "WOOD"            -0.71
    "OHN"             -0.71
    " horizontally"   -0.69
    "tone"            -0.68
    "hunter"          -0.67
    "DAY"             -0.67
    " Bard"           -0.67
    Positive Logits
    Token             Logit
    "ention"          1.41
    "icit"            1.39
    "icted"           1.39
    "orted"           1.38
    "ervation"        1.31
    "idential"        1.31
    "osed"            1.30
    "iction"          1.30
    "ortion"          1.28
    "inent"           1.26
    Activation Density: 2.822%

    No Known Activations