INDEX
Explanations
Instances of the word "access" and related words such as "clients" and "administrative".
Negative Logits
Efq           -1.27
<bos>         -1.05
Monfieur      -1.02
myſelf        -1.01
ſche          -0.98
Theſe         -0.97
Majefty       -0.95
ſelves        -0.94
Shakspeare    -0.94
itſelf        -0.94
Positive Logits
)))));               0.75
bootstrapcdn         0.68
+");                 0.67
.*")]                0.66
so                   0.65
(empty in source)    0.65
)]$                  0.64
))$.                 0.63
)."                  0.62
))$                  0.62
Activations Density 0.899%