INDEX
Explanations
forms of the verb "to be" in various contexts
Negative Logits
lust      -0.19
zag       -0.16
atism     -0.16
anel      -0.16
ecies     -0.15
pants     -0.15
b         -0.15
icz       -0.15
827       -0.15
atica     -0.14
Positive Logits
plitude   0.27
ERICAN    0.23
putation  0.22
plit      0.22
iable     0.21
munition  0.21
oxic      0.20
eba       0.19
oba       0.19
igos      0.19
Activations Density: 0.029%