INDEX
Explanations
phrases that quantify elements related to social issues
Negative Logits
.addTab      -0.15
@Resource    -0.14
itate        -0.14
ote          -0.14
البحر        -0.14
lookahead    -0.14
605          -0.14
539          -0.13
itar         -0.13
군요         -0.13
Positive Logits
the          0.16
iola         0.15
a            0.14
Cod          0.14
óst          0.14
incom        0.13
half         0.13
odian        0.13
several      0.13
som          0.13
Activations Density 0.252%