INDEX
Explanations
references to specific names or entities, possibly related to news articles or events
keywords related to features and functionalities of applications
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.08
5:0.07
6:0.07
7:0.08
8:0.09
9:0.08
10:0.09
11:0.08
Negative Logits
��:-1.45
fide:-1.38
Required:-1.26
ashion:-1.25
lawful:-1.22
iably:-1.21
enters:-1.20
ophone:-1.19
Sto:-1.15
virtue:-1.14
Positive Logits
advertisement:1.52
obos:1.50
nings:1.48
atl:1.45
)]:1.41
amples:1.33
Authors:1.32
ulz:1.31
mining:1.30
Cosponsors:1.28
Activations Density 0.000%