INDEX
    Explanations

    adverbs or adjectives that express certainty, or comparison relative to something else

    words and phrases indicating frequency or temporal aspects

    Negative Logits
     Seym
    -0.81
     Azerb
    -0.62
    ollar
    -0.60
     Coke
    -0.57
     paramilitary
    -0.57
    perty
    -0.56
    bledon
    -0.55
    istani
    -0.54
     disobedience
    -0.54
     Princ
    -0.51
    Positive Logits
     ];
    0.69
     refers
    0.69
     Released
    0.67
    iverse
    0.66
     ]
    0.65
     =>
    0.65
    20439
    0.65
     Asked
    0.64
    malink
    0.64
     Adds
    0.64
    Activation Density 0.147%

    No Known Activations