INDEX
Explanations
phrases related to readability and the act of reading
Negative Logits
Token      Logit
igner      -0.18
ug         -0.16
usc        -0.15
ster       -0.15
ye         -0.15
243        -0.14
cf         -0.14
flavors    -0.14
give       -0.14
ad         -0.14

Positive Logits
Token      Logit
bourg      0.18
ults       0.17
alie       0.17
ÑĤÑĢо      0.16
.mit       0.16
logen      0.15
achte      0.15
atform     0.15
_deinit    0.15
fisse      0.14
Activations Density 0.024%
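
A density figure like the one above is commonly read as the fraction of sampled tokens on which the feature has a nonzero activation. The sketch below assumes that definition. The function name `activation_density` and the sample values are hypothetical and are not taken from this dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 12,500 sampled tokens.
sample = [0.0] * 12_497 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

With these made-up numbers the feature fires on 3 of 12,500 tokens, a density of the same order as the one reported here.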