Explanations
breakdown followed by categorization
Negative Logits
implementar    0.64
מה             0.63
відповідно     0.63
pourtant       0.62
tasked         0.61
implement      0.60
плану          0.60
solides        0.59
shrink         0.59
complex        0.59
Positive Logits
categor          1.26
category         1.23
categorized      1.23
categor          1.16
Categor          1.14
classified       1.07
categorie        1.07
categorization   1.06
CATEG            1.05
Categor          1.04
Activations Density: 0.303%