INDEX
Explanations
specific references or terms related to scientific or technical contexts
Negative Logits
[…]  -0.64
\\  -0.54
-0.48
\  -0.47
مصادر  -0.46
ciaż  -0.46
↵  -0.45
cektir  -0.45
[toxicity=0]  -0.44
getElement  -0.43
Positive Logits
تقاوى  1.30
tagHelperRunner  1.30
AssemblyCompany  1.15
LookAnd  1.06
featureID  1.03
InjectAttribute  1.02
AssemblyCulture  0.99
typelib  0.98
GEBURTSDATUM  0.98
ویکیپدی  0.94
Activations Density 13.891%