INDEX
Explanations
words related to mathematical variables and operations
Negative Logits
previa     -0.91
Ceramby    -0.90
שוליים     -0.79
#+#        -0.77
Gente      -0.75
Sila       -0.74
Dooley     -0.74
Lerner     -0.73
Ferrell    -0.72
Monkey     -0.70
Positive Logits
d          1.41
D          1.28
D          1.20
d          1.20
getD       1.10
d          0.91
д          0.90
Dd         0.86
PhysRevD   0.82
د          0.81
Activations Density 0.223%
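
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" is the share of token positions where the
    feature fires at all; the source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 2 of which activate the feature.
sample = [0.0] * 998 + [0.7, 1.3]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.200%
```

Under that assumed definition, a density of 0.223% would mean the feature fires on roughly 2 of every 1000 tokens.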