Explanations
phrases and expressions that reflect a strong opinion or sentiment
Negative Logits
ĶåĽŀ      -0.15
nder      -0.14
½æķ°      -0.14
ród       -0.14
ULD       -0.13
719       -0.13
247       -0.13
ernals    -0.13
581       -0.13
беÑĢ      -0.13

Positive Logits
ÙĦت       0.15
caled     0.15
itzer     0.15
bn        0.14
dong      0.14
каз       0.14
hong      0.14
ekk       0.13
onsense   0.13
-cent     0.13

Activations Density: 0.068%
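
Some token strings in the logit lists (for example `ĶåĽŀ`, `½æķ°`, `беÑĢ`, `ÙĦت`) look like the printable-character notation that GPT-2-style byte-level BPE tokenizers use for raw bytes. The page does not name the tokenizer, so this is an assumption. Under that assumption, the following minimal sketch maps such a display string back to text. The names `bytes_to_unicode`, `CHAR_TO_BYTE`, and `decode_token` are illustrative. Characters outside the byte table (such as already-readable Cyrillic or Arabic letters) are assumed to be ordinary text and are passed through as UTF-8.

```python
def bytes_to_unicode() -> dict:
    """Byte -> printable character table in the style of GPT-2 byte-level BPE."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    # Every remaining byte gets the next unused code point starting at U+0100.
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a byte-level display string back into text (best effort)."""
    raw = bytearray()
    for ch in display:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table are already plain text.
            raw.extend(ch.encode("utf-8"))
    # A token may start or end in the middle of a multi-byte character;
    # such stray bytes become U+FFFD instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["беÑĢ", "ÙĦت", "½æķ°", "ĶåĽŀ", "nder"]:
        print(repr(token), "->", repr(decode_token(token)))
```

With this mapping, `беÑĢ` decodes to `бер` and `ÙĦت` to `لت`, while `½æķ°` and `ĶåĽŀ` each decode to one stray continuation byte followed by a complete CJK character (`数` and `回`). One caveat: a display string that legitimately contains a Latin-1 character such as `é` is ambiguous under this scheme, because that character also stands for a single raw byte.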