Explanations
specific technical jargon related to scientific concepts
Negative Logits
다는       -0.17
码         -0.16
forman     -0.15
ivate      -0.15
ions       -0.15
inition    -0.15
ши         -0.15
flater     -0.14
imately    -0.14
jspx       -0.14
Positive Logits
er         0.65
e          0.57
a          0.55
y          0.52
o          0.50
i          0.45
ي          0.39
ی          0.38
an         0.36
ing        0.33
Activations Density: 1.877%