INDEX
Explanations
Greek words indicating actions or descriptions
Negative Logits
vutta    0.25
tathapi  0.25
Når      0.25
và       0.25
vadati   0.24
tasmim   0.24
Қ        0.24
vasena   0.24
Dacă     0.24
natthi   0.23
Positive Logits
α    0.28
σ    0.24
β    0.23
ε    0.23
θ    0.23
εξ   0.23
sp   0.22
δ    0.22
ω    0.21
δια  0.21
Activations Density 0.001%