INDEX
    Explanations

    references to historical events and figures

    Head Attr Weights
    0:0.02
    1:0.16
    2:0.10
    3:0.03
    4:0.02
    5:0.05
    6:0.09
    7:0.06
    8:0.09
    9:0.08
    10:0.06
    11:0.20
    Negative Logits
    "arters"            -1.63
    " Patreon"          -1.28
    "orable"            -1.24
    " ASAP"             -1.21
    " Chair"            -1.08
    "��"                -1.04
    " Advertisement"    -1.03
    " preferably"       -1.03
    " Artificial"       -1.02
    "arial"             -1.01
    Positive Logits
    "}}"            1.24
    " vanished"     1.23
    " perceive"     1.19
    "eff"           1.17
    "ynt"           1.15
    "yll"           1.11
    "ois"           1.11
    "ect"           1.10
    " perceived"    1.10
    " perce"        1.09
    Act Density 0.027%

    No Known Activations