INDEX
    Explanations

    names or terms related to character identity or existence

    Head Attr Weights
    0:0.10
    1:0.02
    2:0.28
    3:0.09
    4:0.14
    5:0.06
    6:0.02
    7:0.02
    8:0.06
    9:0.09
    10:0.05
    11:0.02
    Negative Logits
    etheless      -1.43
    anwhile       -1.33
    sylv          -1.20
    eric          -1.19
     outp         -1.19
    yrinth        -1.17
    thora         -1.14
    ollywood      -1.14
     illum        -1.12
    epad          -1.12
    Positive Logits
    interstitial  1.40
    lein          1.38
    akis          1.32
    gaard         1.30
    inger         1.29
    Redditor      1.23
    cki           1.22
     nails        1.21
     Bender       1.20
    fold          1.20
    Act Density 0.001%

    No Known Activations