    Explanations

    words signaling contrast or opposition

    the word "meanwhile" and its contexts indicating ongoing or simultaneous actions

    Negative Logits
    uto             -0.66
     straw          -0.66
     Freddy         -0.63
    isable          -0.62
     Affordable     -0.61
    lic             -0.61
    """             -0.61
     parole         -0.60
     Columb         -0.60
    enders          -0.59
    Positive Logits
    æ©Ł             0.90
    ðĿ              0.78
    ãĤ´ãĥ³          0.73
    ctr             0.72
    CLASSIFIED      0.72
    eredith         0.72
    åĤ              0.70
    å¯              0.70
    ô               0.68
    ï¸ı             0.68
    Act Density 0.010%

    No Known Activations