INDEX
    Explanations

    instances of the word "possibly" followed by other words

    tentative language indicating possibility or uncertainty

    Negative Logits
    Token       Logit
    "ctions"    -0.92
    "unes"      -0.90
    "arthed"    -0.80
    "ĸļ"        -0.77
    "imet"      -0.77
    "itute"     -0.75
    "igers"     -0.74
    "acas"      -0.74
    "igger"     -0.74
    "mire"      -0.73
    Positive Logits
    Token           Logit
    " even"         0.97
    " sooner"       0.79
    " someday"      0.73
    " others"       0.71
    " unsus"        0.70
    " overtake"     0.69
    " worse"        0.68
    " optionally"   0.66
    " eliminate"    0.65
    " possibly"     0.64
    Activation Density: 0.087%

    No Known Activations