INDEX
Explanations
phrases related to authority or importance
frequent use of the article "the."
Negative Logits
†             -0.80
cum           -0.76
ゴン          -0.75
udder         -0.69
:-            -0.68
nonetheless   -0.68
.(            -0.68
・            -0.67
thereby       -0.67
.-            -0.67
Positive Logits
whole         1.08
totality      1.01
biggest       1.00
hardest       0.98
toughest      0.97
guy           0.96
[             0.96
oret          0.93
greatest      0.93
slightest     0.92
Activations Density 0.492%
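
For orientation, a density of 0.492% corresponds to roughly one token position in 203. The sketch below shows one common way such a figure can be computed, assuming density means the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of token positions on which
    the feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which fire.
toy = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.500%
```

Raising `threshold` above zero would count only stronger activations, which gives a lower density for the same data.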