INDEX
    Explanations

    references to specific locations or geographic names

    Followed by non-English text

    Romance language greetings or questions

    Negative Logits
     démocr          -0.98
     himſelf         -0.97
     financières     -0.96
     vectorielles    -0.94
     complètes       -0.94
     químicas        -0.92
     présentes       -0.91
     commerciales    -0.90
     destinées       -0.88
     genoux          -0.88
    Positive Logits
     themselves      0.51
     controllers     0.51
     titles          0.49
     ones            0.49
     examples        0.48
     שונים           0.47
     those           0.47
     primers         0.47
     types           0.47
     eds             0.46
    Activation Density: 0.021%

    No Known Activations