INDEX
Explanations
mathematical operations and variables used in expressions or equations
Negative Logits
+</      -0.50
him      -0.47
tight    -0.41
我也是   -0.41
jeg      -0.41
kel      -0.41
ÄT       -0.40
urop     -0.40
occa     -0.40
Seek     -0.39
Positive Logits
pleaſure  0.99
raiſ      0.93
cauſe     0.92
ſeveral   0.90
ſmall     0.88
uſed      0.88
Theſe     0.86
againſt   0.84
reaſon    0.84
itſelf    0.84
Activations Density 0.011%