Explanations
phrases that denote uniqueness or exceptional qualities
Negative Logits
rome      -0.17
tera      -0.16
inel      -0.15
anela     -0.15
inke      -0.14
acer      -0.14
dg        -0.14
iskey     -0.14
ť         -0.14
Shame     -0.14
Positive Logits
ordinary   0.44
ordinary   0.36
普通       0.35
usual      0.33
typical    0.33
обыч       0.31
usual      0.31
Ordinary   0.31
normal     0.29
普通       0.28
Activations Density 0.075%