INDEX
    Explanations

    statements or quotes within a text

    instances of numerical ratings or scores associated with evaluations

    Negative Logits
     bear       -0.70
     instit     -0.66
     trades     -0.64
     Jinping    -0.64
     citiz      -0.63
     retreat    -0.62
     dilig      -0.62
     lifes      -0.62
     diseng     -0.62
     goods      -0.61
    Positive Logits
    Unlike          0.98
    Instead         0.98
    Both            0.97
    ccording        0.96
    Specifically    0.95
    Asked           0.94
    Earlier         0.94
    Although        0.93
    Though          0.92
    They            0.91
    Act Density 0.431%

    No Known Activations