INDEX
Explanations
adjectives denoting strong characteristics or actions
the word "bold" and its variations
Negative Logits
Cheong     -0.95
OTOS       -0.87
UTERS      -0.71
yer        -0.65
ADS        -0.64
ersion     -0.64
UT         -0.64
rera       -0.64
enfranch   -0.62
rogens     -0.62
Positive Logits
bold       1.09
bold       0.99
faced      0.96
face       0.95
Ital       0.90
er         0.88
ness       0.87
itude      0.82
olini      0.79
ital       0.74
Activations Density: 0.012%