INDEX
    Explanations

    expressions of surprise or astonishment

    Head Attr Weights
    0:0.05
    1:0.05
    2:0.18
    3:0.12
    4:0.03
    5:0.03
    6:0.14
    7:0.11
    8:0.06
    9:0.05
    10:0.06
    11:0.07
    Negative Logits
    "fw": -1.55
    "ctors": -1.41
    "sembly": -1.41
    "adies": -1.40
    " showc": -1.35
    " traged": -1.32
    " livest": -1.31
    " tyr": -1.29
    "usterity": -1.29
    "ModLoader": -1.28
    Positive Logits
    " mole": 1.25
    " istg": 1.16
    " housed": 1.14
    " crosses": 1.11
    " Moment": 1.08
    " peak": 1.08
    "龍契士": 1.06
    "eta": 1.01
    " Blanc": 1.01
    " Bethlehem": 1.00
    Activation Density: 0.018%

    No Known Activations