INDEX
    Explanations

    words related to intensity or strength

    words related to independence

    Negative Logits
    MQ          -0.72
    veyard      -0.68
    Ĥİ          -0.65
    --+         -0.65
    Gate        -0.64
    BY          -0.62
    calling     -0.61
    culosis     -0.61
    sonian      -0.61
     Lumpur     -0.60
    Positive Logits
    ented       1.07
    etermin     1.01
    oled        0.98
    irection    0.89
    ents        0.89
    iour        0.85
    etr         0.84
    ignant      0.83
    inged       0.83
    rawn        0.83
    Activation Density: 0.031%

    No Known Activations