INDEX
Explanations
instances of the verb "to be" in various forms
Negative Logits
.selenium   -0.17
orna        -0.16
opak        -0.14
coz         -0.14
riend       -0.14
von         -0.14
medi        -0.13
happy       -0.13
uraa        -0.13
etter       -0.13
Positive Logits
itself      0.16
thirsty     0.15
tant        0.15
ixo         0.15
ohen        0.15
.done       0.15
PED         0.14
дело        0.14
-done       0.14
aken        0.14
Activations Density 0.111%