INDEX
Explanations
punctuation and specific grammatical structures in sentences
actions and their consequences
Negative Logits
EndGlobalSection    -0.34
stanovnika          -0.34
nahilalakip         -0.32
juos                -0.31
kennis              -0.28
basicConfig         -0.28
IVEREF              -0.28
issenschaft         -0.28
preventiva          -0.27
kvalitet            -0.27
Positive Logits
препратки           0.71
فريبيس              0.63
ſſung               0.59
[token text missing] 0.59
<unused16>          0.59
<unused47>          0.58
клопе               0.58
<unused42>          0.58
<unused3>           0.58
<unused23>          0.58
Activations Density 0.051%