    Explanations

    phrases indicating evidence or inference

    phrases indicating speculation or conjecture

    Negative Logits
    "izons"      -0.78
    "ategory"    -0.75
    "]+"         -0.69
    "itement"    -0.68
    "irling"     -0.68
    "ced"        -0.68
    "iling"      -0.67
    " Airl"      -0.65
    "avorite"    -0.65
    "orem"       -0.65
    Positive Logits
    " probable"      0.82
    " unclear"       0.77
    "ãĤ¨"            0.75
    " doubtful"      0.75
    "imaru"          0.72
    " unfair"        0.71
    "BUS"            0.70
    " abundantly"    0.68
    " reasonable"    0.68
    "Ī"              0.67
    Activation Density: 0.082%

    No Known Activations