    Explanations

    phrases indicating likelihood or possibility

    phrases indicating uncertainty or speculation

    Negative Logits
    Token            Logit
    "rouse"          -0.75
    "avorite"        -0.73
    "orem"           -0.73
    "iling"          -0.72
    "otos"           -0.72
    "irling"         -0.69
    "ategory"        -0.69
    "izons"          -0.68
    "aving"          -0.68
    "]+"             -0.67
    Positive Logits
    Token            Logit
    " doubtful"      0.80
    " unclear"       0.77
    " probable"      0.74
    "Ĥİ"             0.72
    " unfair"        0.71
    " abundantly"    0.66
    "rils"           0.66
    "ril"            0.64
    "Frameworks"     0.64
    " clear"         0.63
    Activation Density: 0.069%

    No Known Activations