INDEX
Explanations
references to specific days, times, and news-related entities
Head Attr Weights
0:0.03
1:0.02
2:0.02
3:0.07
4:0.06
5:0.17
6:0.02
7:0.10
8:0.02
9:0.36
10:0.02
11:0.05
Negative Logits
Cause       -2.02
nutshell    -2.01
={          -2.01
begin       -1.99
Plot        -1.98
proclaim    -1.94
izo         -1.84
cause       -1.84
evolves     -1.80
incent      -1.78
Positive Logits
herself      2.48
himself      2.36
ELS          2.10
yesterday    2.10
ONDON        2.09
she          2.07
his          2.06
EPA          1.99
龍喚士       1.98
separately   1.98
Activations Density 0.007%