    Explanations

    phrases related to explaining or describing something in detail

    Negative Logits
    Token         Logit
    "ktop"        -0.75
    "twitch"      -0.73
    "tes"         -0.73
    "uld"         -0.70
    "imet"        -0.69
    " wisely"     -0.66
    "talk"        -0.63
    "α"           -0.63
    "usb"         -0.62
    "eka"         -0.61
    Positive Logits
    Token              Logit
    " similarities"    0.69
    " conformity"      0.68
    "oran"             0.65
    " exemplary"       0.65
    "tons"             0.65
    "isance"           0.62
    " Heads"           0.60
    " cooperative"     0.58
    " executions"      0.58
    "ãĥ£"              0.57
    Activation Density: 0.163%

    No Known Activations