INDEX
    Explanations

    abstract concepts related to formality and structure

    terms related to forms and characteristics of reality

    Negative Logits
    kers        -0.75
    vier        -0.74
    fman        -0.74
    Ͻ           -0.73
    ERC         -0.71
    secut       -0.70
    kef         -0.69
    ergy        -0.69
    rompt       -0.68
    vag         -0.67

    Positive Logits
    istically   0.74
    atsu        0.70
    butt        0.68
    heads       0.66
     Explosion  0.64
    hound       0.64
    ativity     0.63
     Cloak      0.63
    ipop        0.61
    aries       0.61
    Activation Density: 0.019%

    No Known Activations