INDEX
Explanations
topics related to cultural and social issues
Negative Logits
nues        -0.17
@student    -0.16
asaki       -0.15
ukes        -0.15
iyon        -0.15
.react      -0.15
unda        -0.14
NEL         -0.14
Jo          -0.14
bage        -0.14
Positive Logits
such        0.37
such        0.30
like        0.27
likes       0.24
including   0.24
:           0.23
SUCH        0.21
include     0.20
Such        0.19
如          0.19
Activations Density 0.102%