INDEX
Explanations
words indicative of theoretical discussions or reviews in academic contexts
starting or beginning
we start, begin, proceed
Negative Logits
another        -0.53
another        -0.52
weiteren       -0.52
weitere        -0.51
Another        -0.51
Further        -0.50
ytterligare    -0.50
autre          -0.50
weiterer       -0.50
還能           -0.49
Positive Logits
starts         2.39
start          2.29
starting       2.19
begin          2.08
begins         2.05
Start          2.05
Starting       2.04
Starts         2.03
started        2.01
dimulai        1.99
Activations Density 0.469%