INDEX
    Explanations

    phrases indicating persistence or inevitability in situations

    Negative Logits
    Token       Logit
    "WND"       -0.17
    "_simps"    -0.17
    "adla"      -0.16
    "vox"       -0.15
    "イヤ"      -0.15
    "AIT"       -0.14
    "ewan"      -0.14
    "UTO"       -0.14
    "rowse"     -0.14
    "zos"       -0.14
    Positive Logits
    Token       Logit
    " this"     0.41
    " thus"     0.38
    "这样"      0.36
    "this"      0.35
    " así"      0.35
    "那样"      0.32
    "thus"      0.31
    " THAT"     0.31
    " böyle"    0.31
    " vậy"      0.30
    Activation Density: 0.363%

    No Known Activations