    Explanations

    Phrases related to speaking up or speaking out on various issues or on behalf of others

    Negative Logits
    Token          Logit
    " depic"       -0.95
    " guarante"    -0.90
    " ?..."        -0.86
    " accla"       -0.86
    " encomp"      -0.85
    " increa"      -0.83
    " desir"       -0.80
    " fta"         -0.80
    " fuf"         -0.80
    " »>"          -0.79
    Positive Logits
    Token          Logit
    " speak"       0.74
    " loud"        0.70
    " louder"      0.70
    " spoken"      0.69
    " voice"       0.67
    " mouth"       0.63
    " speaking"    0.62
    " voices"      0.61
    "spoken"       0.61
    " aloud"       0.60
    Act Density 0.446%

    No Known Activations