INDEX
Explanations
references to specific numerical codes and identifiers
Negative Logits
lect      -0.18
egot      -0.17
sen       -0.16
Peace     -0.15
erus      -0.15
Policy    -0.15
senior    -0.14
LARI      -0.14
imbus     -0.14
ertext    -0.14
Positive Logits
oog       0.15
ikal      0.15
/trunk    0.15
OKIE      0.15
adows     0.15
around    0.14
HEST      0.14
赤        0.14
Owner     0.13
ippers    0.13
Activations Density: 0.035%