INDEX
    Explanations

    phrases indicating conditions or requirements before a specific action is taken

    the word "only" in various contexts, often indicating limitations or conditions

    Negative Logits
    acca        -0.75
    aston       -0.75
    ongyang     -0.74
    Mods        -0.73
    WT          -0.73
    arez        -0.72
    ott         -0.71
    redit       -0.68
    ãĤ¦ãĤ¹      -0.68
    UD          -0.68
    Positive Logits
    etheless        1.07
     nonetheless    1.05
     nevertheless   0.82
     overshadowed   0.81
     importantly    0.80
     retains        0.79
     retained       0.73
     succeeded      0.73
     persists       0.72
     remains        0.71
    Activation Density: 0.135%

    No Known Activations