INDEX
Explanations
instances of the word "revision" and its related forms
Negative Logits
orman          -0.16
strict         -0.15
pon            -0.15
erna           -0.14
ume            -0.14
Electronics    -0.14
acus           -0.14
cu             -0.14
uth            -0.13
veil           -0.13
Positive Logits
928            0.16
numberWith     0.15
Willi          0.15
Mand           0.15
comed          0.14
oram           0.14
kip            0.14
Gear           0.14
olulu          0.14
_MAJOR         0.14
Activations Density: 0.005%