Explanations
phrases indicating dismissiveness or a lack of concern regarding various situations
Negative Logits
центра
-0.16
asser
-0.15
cle
-0.15
ments
-0.15
997
-0.15
ittel
-0.15
hlen
-0.13
Latitude
-0.13
633
-0.13
.scalablytyped
-0.13
Positive Logits
cole
0.16
fabric
0.14
ks
0.14
nam
0.14
otton
0.14
kaz
0.14
mes
0.13
ield
0.13
="{!!
0.13
vet
0.13
Activations Density 0.068%