INDEX
Explanations
adjectives and nouns associated with boldness or significance
Negative Logits
idal     -0.17
rawn     -0.17
chen     -0.15
MAS      -0.15
isel     -0.15
ảo       -0.15
erna     -0.15
hood     -0.15
pond     -0.15
¢åįķ     -0.14
Positive Logits
ness      0.17
ienes     0.17
lessness  0.15
ful       0.15
tur       0.14
Bind      0.14
ole       0.14
avin      0.14
stad      0.14
yw        0.14
Activations Density 0.023%
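
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming density means the percentage of sampled tokens on which the feature's activation is above zero. The function name, the threshold, and the sample data are hypothetical and not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, 23 of which activate the feature.
sample = [0.0] * 99_977 + [1.5] * 23
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this assumption, a density of 0.023% corresponds to roughly 23 active tokens per 100,000 sampled.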