Explanations
discussions related to market imperfections and societal issues
statements highlighting societal flaws and criticisms
Negative Logits
Dub           -0.65
ãĥĺ           -0.64
)]            -0.62
Springer      -0.60
çīĪ           -0.58
Originally    -0.57
Scroll        -0.57
Bund          -0.56
inder         -0.55
Chung         -0.55
Positive Logits
imperfect     1.11
plenty        0.97
undeniably    0.96
certainly     0.95
occasionally  0.95
unavoid       0.90
exist         0.88
nevertheless  0.84
undoubtedly   0.84
nuanced       0.83
Activations Density 0.706%