INDEX
    Explanations
    Head Attr Weights
    0:0.04
    1:0.03
    2:0.09
    3:0.28
    4:0.08
    5:0.04
    6:0.04
    7:0.05
    8:0.06
    9:0.07
    10:0.09
    11:0.08
    Negative Logits
     whoever      -1.72
     Meow         -1.41
     Lilly        -1.39
     wherever     -1.38
     motto        -1.33
     whenever     -1.33
     Amendments   -1.33
     Elaine       -1.33
    udos          -1.29
     Nicola       -1.29
    Positive Logits
    lag       1.62
    arching   1.57
    alm       1.54
    alties    1.53
    mble      1.51
    yrus      1.50
    atro      1.44
    icka      1.43
    ć         1.42
    asin      1.42
    Act Density 0.000%

    No Known Activations