INDEX
Explanations
current from food or electronics
Negative Logits
botón      0.54
acero      0.49
takım      0.47
חנו        0.46
méthode    0.46
回転       0.45
푼         0.45
Tahun      0.45
všetky     0.45
estándar   0.44
Positive Logits
i              0.57
nat            0.56
Nat            0.50
uti            0.47
Scandinavian   0.46
призна         0.46
eres           0.45
ira            0.44
nn             0.44
도             0.44
Activations Density 0.003%