Explanations
adjectives that describe the quality or characteristics of various subjects
Negative Logits
Token      Logit
ASN        -0.16
ABCDEFG    -0.16
ména       -0.15
UILTIN     -0.15
abcdef     -0.15
aco        -0.15
onde       -0.14
619        -0.14
AS         -0.14
ーボ       -0.14
Positive Logits
Token      Logit
as         0.63
als        0.34
sebagai    0.30
как        0.29
作为       0.26
ως         0.26
jako       0.25
as         0.24
як         0.24
為         0.23
Activations Density: 0.079%