    Explanations

    References to individuals, particularly in the context of events or statements in news articles

    Negative Logits
    iseaux
    -0.50
     natürlichen
    -0.49
    })*/
    -0.49
    ancy
    -0.48
     poffible
    -0.48
    celes
    -0.48
    )!
    -0.48
     earthen
    -0.47
    δου
    -0.47
     hisz
    -0.46
    Positive Logits
    twimg
    0.69
    UnsafeEnabled
    0.66
    nytimes
    0.59
    +#+#
    0.59
     GenerationType
    0.59
    Hozzá
    0.58
    Cyfeiriadau
    0.57
    Xna
    0.56
    SequentialGroup
    0.56
    WebVitals
    0.55
    Activation Density: 0.085%

    No Known Activations