    Explanations

    Attends to tokens that imply completion or reaching a goal, from tokens that indicate a lack or absence.

    Head Attr Weights
    0:0.15
    1:0.11
    2:0.10
    3:0.11
    4:0.09
    5:0.04
    6:0.18
    7:0.17
    Negative Logits
     viewDidLoad    -0.32
     astore         -0.31
    <eos>           -0.31
    PRNewswire      -0.27
    manni           -0.27
    ticularly       -0.27
    icrous          -0.25
                    -0.25
    raquo           -0.25
     hoạch          -0.24
    Positive Logits
     myſelf         0.63
     itſelf         0.53
     himſelf        0.52
     Monfieur       0.52
     themſelves     0.50
     reaſon         0.50
     uſed           0.49
     ſtate          0.48
     pleaſure       0.47
     Majefty        0.47
    Act Density 0.046%

    No Known Activations