INDEX
    Explanations

    attends to tokens carrying numerical values, from square-bracket tokens that indicate placement in a sequence

    Head Attr Weights
    0:0.15
    1:0.12
    2:0.10
    3:0.11
    4:0.11
    5:0.10
    6:0.11
    7:0.17
    Negative Logits
    <eos>  -0.27
    :  -0.26
     age  -0.25
     honor  -0.22
     with  -0.22
     distância  -0.22
     tej  -0.22
    onomy  -0.21
     població  -0.21
     (  -0.21
    Positive Logits
     itſelf  0.55
     Efq  0.50
     myſelf  0.49
    ſelves  0.43
     pleaſure  0.43
     Reſ  0.42
     unſ  0.42
     Monfieur  0.41
     ſy  0.41
    ſelf  0.41
    Act Density 0.141%

    No Known Activations