Explanations
expressions related to the seriousness of various issues or opinions
Negative Logits
Äįer    -0.17
sov     -0.17
ven     -0.15
colo    -0.15
iven    -0.15
дов     -0.14
addle   -0.14
olate   -0.14
emie    -0.14
anou    -0.13
Positive Logits
seriously      0.48
Seriously      0.36
serious        0.35
seriousness    0.29
Seriously      0.29
Serious        0.27
seri           0.25
serious        0.25
ser            0.21
váž            0.21
Activations Density 0.035%
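
As a rough illustration of how a density figure like the one above could arise, the sketch below assumes that activation density means the share of token positions on which the feature's activation exceeds zero. The page does not state that definition, so treat it as an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so that the result matches 0.035%.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 7 active positions out of 20,000 tokens.
sample = [0.0] * 19_993 + [1.2, 0.4, 2.0, 0.1, 0.9, 3.3, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.035% would mean the feature fires on roughly 1 in every 2,900 tokens.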