INDEX
Explanations
"that" followed by conjunction
Negative Logits
Token       Value
kes         0.47
Kes         0.45
T           0.43
Kes         0.42
Lup         0.41
פּ           0.41
F           0.41
Fl          0.40
زالة        0.39
Gre         0.38
Positive Logits
Token       Value
ouwd        0.42
ologisk     0.41
piratory    0.41
ücksicht    0.41
القيمه      0.40
supervision 0.38
supervised  0.38
ialog       0.38
價值        0.38
hervor      0.38
Activations Density 0.000%