INDEX
Explanations
various types of ratios and their implications in different contexts
Negative Logits
gotten      -0.76
sylvania    -0.69
dream       -0.69
(blank)     -0.67
Ic          -0.65
REAM        -0.65
uggage      -0.64
asper       -0.64
concess     -0.63
ARK         -0.62
Positive Logits
xual         1.04
ratios       0.96
ratio        0.87
yip          0.83
ically       0.80
nces         0.79
mismatch     0.75
ical         0.74
ally         0.73
numer        0.72
Activations Density 0.006%