INDEX
    Explanations

    references to comparisons or analogies involving significant historical events or figures

    Head Attr Weights
    0:0.01
    1:0.03
    2:0.09
    3:0.08
    4:0.11
    5:0.03
    6:0.02
    7:0.37
    8:0.03
    9:0.03
    10:0.05
    11:0.11
    Negative Logits
    URE
    -1.89
    ivo
    -1.62
    alde
    -1.59
    itol
    -1.56
    duction
    -1.48
    otto
    -1.45
    esse
    -1.44
    URES
    -1.42
     Newsletter
    -1.42
    iott
    -1.42
    Positive Logits
    verty
    1.78
     Huss
    1.60
    hov
    1.53
     pros
    1.52
     Kard
    1.41
     Gur
    1.40
     intens
    1.39
    ��
    1.38
     tatt
    1.36
     Caucas
    1.34
    Act Density 0.009%

    No Known Activations