INDEX
    Explanations

    concerns or expressions of worry within texts

    references to worries or issues

    Negative Logits
    Token           Logit
    "nice"          -0.71
    "ctors"         -0.71
    "gall"          -0.69
    "buff"          -0.67
    "graph"         -0.64
    "SW"            -0.62
    "INAL"          -0.62
    "cle"           -0.61
    "Interview"     -0.60
    "tiny"          -0.60
    Positive Logits
    Token           Logit
    "afety"          1.06
    " concerns"      0.91
    " regarding"     0.85
    " Concern"       0.81
    " raised"        0.80
    " pertaining"    0.79
    " relating"      0.78
    "hooting"        0.78
    " concern"       0.78
    " arising"       0.76
    Activation Density: 0.030%

    No Known Activations