INDEX
    Explanations

    numerical values and references to specific years or dates

    Negative Logits
     neighb        -0.77
     derby         -0.75
     swall         -0.74
     dubbed        -0.71
     edges         -0.71
     redes         -0.70
     relegation    -0.70
     tram          -0.68
     betting       -0.68
     bundled       -0.68
    Positive Logits
    "…             1.75
    "[             1.69
    "...           1.67
    "'             1.63
    Dear           1.61
    Quote          1.56
    "              1.45
    "(             1.42
    When           1.39
    Hello          1.38
    Act Density 0.107%

    No Known Activations