INDEX
    Explanations

    dates and locations in news articles

    specific dates and numerical representations in the text

    Negative Logits
    etheless    -0.76
     streng     -0.72
    ioned       -0.70
    ļéĨĴ        -0.66
    Ͻ           -0.65
    ldon        -0.64
    everal      -0.62
     suprem     -0.61
    ensical     -0.61
     confir     -0.61
    Positive Logits
     2018    1.50
     2017    1.48
     2014    1.47
     2013    1.45
     2012    1.43
     2015    1.42
     2008    1.38
     2009    1.38
     2016    1.38
     2011    1.37
    Act Density 0.051%

    No Known Activations