INDEX
Explanations
the verb "being" in various contexts
Negative Logits
urat      -0.17
plr       -0.17
ulis      -0.16
oup       -0.16
ertino    -0.16
%A        -0.15
igi       -0.15
scratch   -0.15
senal     -0.14
aylor     -0.14
Positive Logits
istant    0.15
elden     0.15
OMEM      0.14
æ®        0.14
reserved  0.14
loub      0.13
ike       0.13
ago       0.13
gm        0.13
athi      0.13
Activations Density: 0.017%