Explanations
describing subject-verb states
Negative Logits
ंपूर्  0.44
απε  0.41
자유  0.41
Mais  0.40
͎  0.40
مباش  0.38
подклю  0.37
आख  0.37
สวน  0.36
patan  0.36
Positive Logits
dummy  0.62
dummies  0.56
Dummy  0.53
dummy  0.52
introduction  0.51
Introduction  0.51
Introduction  0.50
existential  0.50
there  0.46
impersonal  0.46
Activations Density: 0.042%