Explanations
topics or situations where people are divided or in disagreement
Negative Logits
ーティ        -0.77
ufact         -0.60
periphery     -0.58
unden         -0.56
axis          -0.56
Uncommon      -0.56
wagen         -0.54
Antar         -0.53
ongs          -0.51
Interstitial  -0.51
Positive Logits
about       1.33
regarding   1.29
whether     1.25
ABOUT       1.18
concerning  1.15
about       1.13
About       1.01
how         0.97
whether     0.97
Regarding   0.95
Activations Density 0.265%
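
The density figure can be read as the share of sampled token positions on which the feature fires at all. The sketch below assumes that definition (activation strictly above zero counts as firing), which the page itself does not state. The function name `activation_density` and the sample data are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a non-zero
    activation; the dashboard does not spell out its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 20,000 token positions, 53 of which fire.
sample = [0.0] * 19947 + [1.0] * 53
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 1% means the feature is sparse: it stays silent on the vast majority of tokens and fires only in a narrow set of contexts.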