INDEX
Explanations
contrastive conjunctions and adjectives that highlight qualities or notable characteristics
Negative Logits
aggi         -0.17
isku         -0.15
emu          -0.14
ulas         -0.14
.Ptr         -0.14
lec          -0.14
лек          -0.14
ityEngine    -0.13
besides      -0.13
bast         -0.13
Positive Logits
owo          0.16
ooks         0.16
agne         0.15
anning       0.15
panion       0.15
acks         0.14
eyh          0.14
.semantic    0.14
ylene        0.14
umber        0.13
Activations Density: 0.034%