Explanations
phrases emphasizing the importance of specific conditions or situations
Negative Logits
arella	-0.52
啲	-0.51
ufig	-0.50
اجة	-0.48
ریز	-0.47
פתח	-0.46
لاث	-0.46
Defensa	-0.45
Málaga	-0.45
إيران	-0.45
Positive Logits
matter	1.29
regardless	1.26
Regardless	1.22
MATTER	1.22
regardless	1.20
Regardless	1.14
Matter	1.11
matter	1.08
Matter	1.04
ostante	1.02
Activations Density 0.052%
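
As a rough illustration of how to read the density figure, the sketch below assumes that "activations density" means the fraction of tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.3, 0.7, 2.1, 0.4, 0.9]
print(f"{activation_density(sample):.3%}")  # prints 0.050%
```

Under that assumption, a density of 0.052% corresponds to the feature firing on roughly 5 of every 10,000 tokens.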