INDEX
    Explanations

    words related to statements or opinions

    statements or assertions made by advocates or officials

    Negative Logits
    Token         Logit
     Himself      -0.94
    ï¸            -0.70
    ascript       -0.67
    à¦            -0.65
    SourceFile    -0.65
    Lex           -0.65
    crow          -0.65
    ufact         -0.64
    artist        -0.63
    icut          -0.63
    Positive Logits
    Token         Logit
     they         0.99
     majorities   0.71
     there        0.70
     theirs       0.69
     otherwise    0.69
     goodbye      0.67
     it           0.64
     that         0.64
     alike        0.63
     loopholes    0.63
    Act Density 0.156%

    No Known Activations