INDEX
Explanations
references to numbers and their significance in various contexts
Negative Logits
uter      -0.17
atcher    -0.15
rew       -0.15
urg       -0.15
atter     -0.14
ampion    -0.14
yntax     -0.14
_RD       -0.14
heimer    -0.14
344       -0.14
Positive Logits
Reasons      0.23
Degrees      0.19
Ways         0.19
reasons      0.19
Questions    0.18
Steps        0.18
Faces        0.18
Reason       0.18
Minutes      0.17
Cent         0.17
Activations Density: 0.100%