INDEX
Explanations
instances of the word "be" and its various forms
Negative Logits
mana     -0.17
esar     -0.17
leta     -0.15
legg     -0.15
ecal     -0.14
Pan      -0.14
çĽ       -0.14
pan      -0.14
trope    -0.14
quier    -0.13
Positive Logits
beh      0.15
yet      0.15
ienne    0.15
ÏĥÏĦαν   0.15
akan     0.14
пов      0.14
iele     0.14
ols      0.14
jist     0.14
kli      0.14
Activations Density 0.000%