INDEX
Explanations
news-related entities like names of individuals, places, and organizations
legal terms and phrases associated with crime and legal proceedings
Negative Logits
ĸļ           -0.75
uph          -0.72
©¶æ          -0.67
weekends     -0.66
purs         -0.65
everyday     -0.64
covenant     -0.64
ourselves    -0.64
ability      -0.63
¬¼           -0.63
Positive Logits
ccording     1.15
According    1.04
Writing      1.03
Asked        1.02
Among        1.02
SPONSORED    1.00
However      0.99
Sources      0.98
But          0.98
Their        0.98
Activations Density 0.542%