INDEX
    Explanations

    time-related phrases

    phrases that indicate future actions or plans

    Negative Logits
    Token           Logit
    "Definition"    -0.75
    "Liter"         -0.74
    "ritical"       -0.73
    "Roman"         -0.73
    "bian"          -0.71
    "usage"         -0.68
    "Vers"          -0.67
    "thodox"        -0.67
    "catentry"      -0.66
    "ullah"         -0.65
    Positive Logits
    Token           Logit
    " plenty"       0.70
    "anmar"         0.68
    "ï¸ı"           0.66
    "]}"            0.65
    "ATURE"         0.63
    " Delete"       0.62
    " tune"         0.61
    " Pablo"        0.60
    " deleted"      0.60
    " Adventures"   0.59
    Activation Density: 0.090%

    No Known Activations