Explanations
news-related terms or headlines
statements characterized by strong adjectives or descriptors
Negative Logits
assies            -0.71
rooms             -0.70
verty             -0.69
fame              -0.69
corridors         -0.69
Frames            -0.67
æ©                -0.67
formations        -0.67
constituencies    -0.67
SPONSORED         -0.65
Positive Logits
spokesperson      0.84
ccording          0.83
researcher        0.77
example           0.75
spokesman         0.71
staffer           0.70
spokeswoman       0.69
refres            0.66
exception         0.66
therapist         0.65
Activations Density 0.467%