INDEX
Explanations
the word "simple" and its variants
simple and straightforward language
Negative Logits
vance          -0.79
largeDownload  -0.75
hovah          -0.75
extensively    -0.74
reon           -0.70
ibal           -0.69
ashington      -0.69
haw            -0.69
vigorously     -0.68
raints         -0.67
Positive Logits
tons           1.42
minded         0.98
ton            0.97
minded         0.97
arithmetic     0.94
syrup          0.93
explanation    0.84
wallet         0.82
ified          0.81
elegance       0.80
Activations Density: 0.044%