Explanations
concepts related to balance in various contexts
Negative Logits
lis      -0.16
ollah    -0.16
ensa     -0.15
oble     -0.15
Wol      -0.15
²        -0.14
lus      -0.14
antha    -0.14
kin      -0.13
zin      -0.13
Positive Logits
balance     0.23
balance     0.20
(balance    0.19
Balance     0.18
-priced     0.16
-sama       0.16
tures       0.15
уди         0.15
OfString    0.15
щи          0.15
Activations Density 0.060%