INDEX
    Explanations

    exclamatory phrases and expressions of surprise or emphasis

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.06
    3:0.08
    4:0.16
    5:0.04
    6:0.06
    7:0.25
    8:0.05
    9:0.05
    10:0.07
    11:0.10
    Negative Logits
    ワン:-1.73
    pert:-1.46
    unte:-1.36
    vere:-1.31
    ACTION:-1.30
     Orig:-1.29
    andom:-1.28
    essor:-1.27
     actionGroup:-1.27
    fle:-1.26
    Positive Logits
    LOS:1.51
     gaps:1.45
    ija:1.45
     trickle:1.44
     glaring:1.43
     heavens:1.36
    lasses:1.35
    anwhile:1.32
     backwards:1.28
     footsteps:1.27
    Act Density 0.001%

    No Known Activations