INDEX
    Explanations

    sentences with question marks indicating a query or uncertainty

    Negative Logits
    Token          Logit
    "places"       -0.68
    "chairs"       -0.67
    "ipel"         -0.65
    "runners"      -0.65
    "gra"          -0.64
    "ilan"         -0.64
    "chat"         -0.61
    "forts"        -0.60
    "agra"         -0.59
    " mosqu"       -0.59
    Positive Logits
    Token          Logit
    " Thou"        0.87
    " thou"        0.82
    " anybody"     0.78
    " nobody"      0.73
    " anyone"      0.72
    " there"       0.70
    ")),"          0.67
    "fortunately"  0.66
    " YOU"         0.66
    " it"          0.65
    Activation Density: 0.052%

    No Known Activations