INDEX
Explanations
references to social issues and community impacts
Negative Logits
firstly     -0.23
either      -0.21
Firstly     -0.19
既          -0.18
either      -0.18
:           -0.17
nejen       -0.17
まず        -0.17
både        -0.17
EITHER      -0.17
Positive Logits
consequ       0.28
importantly   0.26
subsequent    0.26
overall       0.23
subsequently  0.22
consequently  0.22
others        0.21
/or           0.20
other         0.19
overall       0.19
Activations Density 1.479%