INDEX
Explanations
phrases expressing admiration or preference
the frequency of the word "the" in various contexts
Negative Logits
accordingly    -0.76
according      -0.75
strate         -0.73
SPONSORED      -0.73
ontent         -0.70
namely         -0.69
thereby        -0.69
IFA            -0.69
alion          -0.68
owned          -0.67
Positive Logits
idea           1.43
notion         1.19
possibility    1.08
opportunity    1.02
concept        1.01
prospect       0.99
latter         0.98
fact           0.98
simplicity     0.94
finer          0.93
Activations Density: 0.212%