Explanations
elements related to sentencing and language structure
Negative Logits
Token            Logit
Warburton        -0.81
первых           -0.73
dolu             -0.73
ofire            -0.71
חיצוניים         -0.70
icoot            -0.69
chaikovsky       -0.67
Warna            -0.66
nicio            -0.66
AutoScaleMode    -0.65
Positive Logits
Token        Logit
sentences    1.59
sentence     1.54
Sentence     1.51
Sentences    1.36
sentences    1.31
Sentence     1.27
sentence     1.22
sentenced    1.03
frase        0.99
Cune         0.94
Activations Density 0.136%