    Explanations

    phrases related to emotional intensity or conflict

    mentions of drama in various contexts

    Negative Logits
    Token         Logit
    "ovo"         -0.80
    "aton"        -0.79
    "haps"        -0.79
    "initions"    -0.75
    "ever"        -0.71
    "laws"        -0.71
    "oaded"       -0.71
    "alties"      -0.70
    "gling"       -0.70
    "othing"      -0.69

    Positive Logits
    Token         Logit
    " drama"      1.25
    " dramas"     1.08
    " Drama"      1.03
    " queens"     0.85
    " unfold"     0.84
    " resil"      0.83
    " SQU"        0.77
    " unfolds"    0.75
    " opera"      0.75
    "rama"        0.73

    Activation Density: 0.008%

    No Known Activations