INDEX
Explanations
concepts related to scientific principles and axioms
Negative Logits
Token           Logit
Elk             -0.15
ran             -0.15
ɵ               -0.14
_targets        -0.14
à¸Ĥà¸Ńà¸ĩà¸ľ    -0.14
esp             -0.13
dek             -0.13
stereotype      -0.13
æŁ³             -0.13
ëħĦëıĦë³Ħ       -0.13
Positive Logits
Token           Logit
Laws            0.20
kowski          0.18
laws            0.17
fals            0.16
laws            0.16
opper           0.16
Science         0.16
erah            0.16
Reality         0.16
science         0.15
Activations Density: 0.353%