INDEX
    Explanations

    words related to causation and outcome

    phrases that indicate consequences or effects

    Negative Logits
    "ĸļ"          -0.73
    " Jur"        -0.66
    " Bastard"    -0.62
    " Kun"        -0.60
    " Wend"       -0.60
    " Pants"      -0.58
    " Kush"       -0.58
    "raint"       -0.57
    " Sut"        -0.57
    " Sabb"       -0.57
    Positive Logits
    "depending"   0.80
    " depending"  0.78
    " tricky"     0.72
    "gettable"    0.72
    " safely"     0.69
    "odder"       0.68
    "ESE"         0.68
    " anywhere"   0.67
    " easily"     0.64
    "pole"        0.64
    Act Density 0.300%

    No Known Activations