INDEX
Explanations
numerical data and formatting related to publications or reports
Negative Logits
ειο     -0.16
metal   -0.16
sor     -0.15
iou     -0.15
ork     -0.15
age     -0.15
set     -0.15
unb     -0.15
bast    -0.14
Cath    -0.14
Positive Logits
ozor          0.16
Rings         0.16
klu           0.16
Interpolator  0.14
iyat          0.14
reamble       0.14
namoro        0.14
romium        0.14
arda          0.14
æ·            0.14
Activations Density: 0.001%