INDEX
    Explanations

    words related to comments, explanations, and interactions

    Negative Logits
    selves          -0.64
    common          -0.64
    hub             -0.63
    aura            -0.60
     unison         -0.59
     Composite      -0.58
    emale           -0.56
    ogether         -0.54
    avia            -0.54
     collective     -0.53
    Positive Logits
     himself        1.17
     Himself        0.80
     thence         0.65
     lect           0.64
     personally     0.64
     his            0.63
     solo           0.62
    imaru           0.60
     remorse        0.60
     resign         0.60
    Activation Density: 0.487%

    No Known Activations