Explanations
terminology related to axioms and their variations in scientific context
Negative Logits
Token      Logit
亮         -0.19
occo       -0.18
acey       -0.18
ancock     -0.17
uffs       -0.17
erset      -0.17
aste       -0.16
ajar       -0.16
utschen    -0.15
askell     -0.15
Positive Logits
Token      Logit
ioms       0.32
illary     0.31
ially      0.28
minster    0.27
cess       0.25
onal       0.24
elsen      0.23
les        0.21
pert       0.21
eman       0.21
Activations Density 0.012%