INDEX
    Explanations

    proper nouns and phrases related to questioning or debating aspects of a topic

    references to specific entities and criticisms of government and institutions

    Negative Logits
    .�    -0.66
    .<    -0.63
    oven  -0.61
    };    -0.60
    .''   -0.60
    ŃĶ    -0.59
    .;    -0.58
    ģ«    -0.57
    >.    -0.57
    `.    -0.56
    Positive Logits
     exists    0.93
     fails     0.93
     couldn    0.92
     lacks     0.87
     might     0.86
     shines    0.86
     hasn      0.85
     refuses   0.85
     suddenly  0.84
     behaves   0.84
    Activation Density 0.532%

    No Known Activations