    Explanations

    phrases related to political and legal discussions

    Negative Logits
    Token          Logit
    "*/("          -0.74
    "orate"        -0.70
    "istically"    -0.69
    "istical"      -0.68
    "ibilities"    -0.67
    "aciously"     -0.67
    " proble"      -0.66
    "bably"        -0.66
    "othal"        -0.64
    "thal"         -0.63
    Positive Logits
    Token          Logit
    " ours"        0.84
    " those"       0.78
    "those"        0.74
    "unts"         0.74
    " Ray"         0.68
    "çͰ"           0.67
    "ounter"       0.65
    "par"          0.63
    " myself"      0.63
    " Graves"      0.62
    Activation Density: 0.056%

    No Known Activations