INDEX
Explanations
phrases related to news articles or report titles
formatting elements and structural components within the text
Negative Logits
Samar       -0.77
leans       -0.73
Plane       -0.71
DERR        -0.67
ĪĴ          -0.67
ministic    -0.66
milo        -0.63
pora        -0.59
Parables    -0.59
ernel       -0.58
Positive Logits
]"          1.02
]           0.96
][/         0.91
quote       0.89
inline      0.87
=]          0.86
][          0.84
%]          0.84
gallery     0.80
"]=>        0.78
Activations Density: 0.064%