INDEX
Explanations
specific articles and prepositions indicating common subjects or actions
Negative Logits
conformidad      -0.39
begitu           -0.38
lendemain        -0.37
antaranya        -0.37
faudrait         -0.36
antemano         -0.36
tournage         -0.36
antaranya        -0.35
Komunikasi       -0.35
accompagnement   -0.35
Positive Logits
⟬                0.65
脚注の使い方      0.62
パンチラ          0.60
<unused79>       0.59
<unused8>        0.59
[@BOS@]          0.59
<unused41>       0.59
<unused42>       0.59
<unused28>       0.59
<unused14>       0.59
Activations Density: 0.122%