Explanations
phrases related to scientific methodology and process steps in experiments
Negative Logits
even       -0.74
yet        -0.66
zelfs      -0.62
even       -0.61
Even       -0.61
Даже       -0.59
chiar      -0.59
certainly  -0.58
Incluso    -0.58
nawet      -0.57
Positive Logits
Briefly    0.85
ابتدا      0.75
Eighteen   0.74
Twenty     0.74
Thirty     0.73
まず       0.73
Twelve     0.73
sixty      0.72
Forty      0.71
RESULTS    0.71
Activations Density 1.183%