INDEX
Explanations
adjectives that describe feelings or conditions
Negative Logits
Äĥr           -0.16
istar         -0.15
eing          -0.14
eldig         -0.14
anio          -0.14
,—            -0.14
ouver         -0.14
addtogroup    -0.14
thern         -0.13
ollar         -0.13
Positive Logits
-             0.16
Walsh         0.15
á»Ļi          0.15
–             0.14
616           0.14
736           0.14
’Brien        0.14
357           0.13
via           0.13
tang          0.13
Activations Density 0.000%