INDEX
    Explanations

    Sentences and phrases that reference artistic works, such as movies or books, with emphasis on their titles and emotional impact.

    Negative Logits
    Token      Logit
    (blank)    -0.67
    <bos>      -0.65
    .          -0.55
    -          -0.53
    (blank)    -0.50
    ↵↵         -0.49
    :          -0.48
    ;          -0.48
    <eos>      -0.47
     of        -0.46
    Positive Logits
    Token          Logit
    ſelf           0.93
    ^(@)           0.92
    ".             0.91
    вгений         0.90
     otomatig      0.89
    addCriterion   0.89
     doubtnut      0.87
     Majefty       0.86
    ſelves         0.85
     Meksiku       0.84
    Activation Density: 0.541%

    No Known Activations