Explanations
instances of the verb "to be" in various forms
Negative Logits
Token        Logit
morgan       -0.17
ebek         -0.16
oš           -0.16
CHANT        -0.16
опол         -0.15
aalborg      -0.15
aravel       -0.15
ména         -0.15
Erotische    -0.15
osti         -0.15
Positive Logits
Token        Logit
given        0.18
475          0.17
462          0.17
prevailed    0.16
iro          0.16
brought      0.15
persistent   0.15
taken        0.15
told         0.15
forced       0.15
Activations Density: 0.208%