Explanations
mentions of sources and information credibility
Negative Logits
è¿İ         -0.14
_DECLARE    -0.13
linger      -0.13
opus        -0.13
ampion      -0.13
chas        -0.13
оже         -0.12
arser       -0.12
اÙĦÙĩ       -0.12
gaard       -0.12
Positive Logits
sources         0.42
source          0.34
anonymous       0.33
anonymously     0.33
sources         0.32
insiders        0.32
knowledgeable   0.32
familiar        0.32
Sources         0.32
confidential    0.32
Activations Density: 0.085%