INDEX
Explanations
specific biological or physiological processes and their components
Word after articles ("a", "an", "the")
articles followed by nouns
Negative Logits
pri       -0.30
quality   -0.28
Level     -0.26
monter    -0.26
and       -0.26
,         -0.25
compare   -0.25
level     -0.25
“         -0.24
recon     -0.24
Positive Logits
المعيارى    0.84
Dieſe       0.74
<unused47>  0.73
ویکیپدی     0.73
<unused16>  0.73
<unused3>   0.73
<unused23>  0.73
<unused41>  0.73
<unused14>  0.73
<unused8>   0.73
Activations Density: 0.411%