Explanations
phrases indicating relationships and exchanges in collaborative contexts
the more, the more
Negative Logits
beſch           -0.70
niſſe           -0.70
kasarigan       -0.69
mijne           -0.65
laſſen          -0.65
iſchen          -0.63
RegressionTest  -0.63
miniaturka      -0.62
zijne           -0.62
TestingModule   -0.62
Positive Logits
base            0.35
!               0.33
Больше          0.33
nakalista       0.33
column          0.32
column          0.30
ater            0.30
Base            0.28
better          0.28
Yo              0.28
Activations Density 0.021%