INDEX
    Explanations

    references to specific years and numerical data related to events or statistics

    Head Attr Weights
    0:0.10
    1:0.02
    2:0.03
    3:0.07
    4:0.18
    5:0.17
    6:0.04
    7:0.01
    8:0.08
    9:0.05
    10:0.01
    11:0.17
    Negative Logits
    "undrum"       -1.92
    " deity"       -1.89
    " torch"       -1.79
    "Assembly"     -1.76
    "Ancient"      -1.75
    " heavenly"    -1.72
    " Flavoring"   -1.71
    "Redditor"     -1.70
    "Blu"          -1.70
    "Chicken"      -1.69
    Positive Logits
    " meanwhile"   2.36
    " thereafter"  2.08
    " incidents"   2.06
    " surveys"     2.03
    "SPONSORED"    2.01
    " Sessions"    1.98
    " interns"     1.96
    " onwards"     1.95
    " Carroll"     1.90
    " however"     1.84
    Act Density 0.002%

    No Known Activations