    Explanations

    statements related to political issues and international relations

    Negative Logits
     lifes        -0.84
     nightly      -0.75
     hero         -0.73
     sucker       -0.71
     stray        -0.70
     bear         -0.69
     instinct     -0.69
     crush        -0.69
     dolphin      -0.68
     nomine       -0.67
    Positive Logits
    Finally        1.76
    Regarding      1.72
    Lastly         1.70
    Furthermore    1.69
    Moreover       1.69
    However        1.68
    Ultimately     1.66
    Additionally   1.63
    Nevertheless   1.61
    Similarly      1.59
    Act Density 0.476%

    No Known Activations