INDEX
Explanations
specific dates and locations in a news article
references to dates and news events
Negative Logits
Token      Logit
Kush       -0.96
wine       -0.96
Wad        -0.93
Sov        -0.91
Gaw        -0.87
Sov        -0.83
wor        -0.81
Williams   -0.80
ques       -0.79
Wilson     -0.78
Positive Logits
Token      Logit
itri       0.92
Pillar     0.85
Roberto    0.83
24         0.81
24         0.80
iston      0.74
Pri        0.74
Instit     0.73
iman       0.73
244        0.73
Activations Density 1.098%