INDEX
Explanations
highlights and key points in news stories
sections that summarize key points or highlights from stories or articles
Negative Logits
nurs     -0.85
unci     -0.79
ogly     -0.78
ŃĶ       -0.75
tails    -0.72
abase    -0.70
olesc    -0.68
ander    -0.67
unt      -0.67
andro    -0.67
Positive Logits
WATCHED         0.80
ï               0.75
Surveillance    0.73
Researchers     0.72
Decoder         0.71
Highlights      0.71
Kaine           0.71
Transcript      0.70
Videos          0.70
Legislation     0.69
Activations Density 0.016%