Explanations
terms related to gentility and its associated concepts
Negative Logits
Token      Logit
edor       -0.16
eries      -0.15
CTION      -0.15
iedades    -0.15
arding     -0.15
yers       -0.15
SELL       -0.15
šť         -0.15
ptime      -0.15
íĥĪ        -0.14
Positive Logits
Token      Logit
lemen      0.32
gent       0.27
gent       0.24
Gent       0.24
lem        0.22
leness     0.21
lemn       0.20
amic       0.20
ile        0.19
gentle     0.18
Activations Density: 0.008%