INDEX
    Explanations

    dialogue or direct speech with quotation marks

    direct quotations or speech within the text

    Negative Logits
    Token             Logit
    "avorite"         -0.77
    " triv"           -0.65
    " penal"          -0.65
    " guest"          -0.65
    " notoriously"    -0.64
    " weakened"       -0.63
    " foreground"     -0.63
    " affected"       -0.63
    " prone"          -0.62
    "âĹ¼"             -0.62
    Positive Logits
    Token             Logit
    "Oh"              0.93
    "Hey"             0.92
    "cow"             0.91
    "Jesus"           0.91
    "hey"             0.90
    "I"               0.89
    "nothing"         0.83
    "Nothing"         0.81
    "Fuck"            0.81
    "Bring"           0.80
    Activation Density: 0.086%

    No Known Activations