Explanations
references to temporal relationships or occurrences
Negative Logits
[token not rendered]  -0.93
ロウィン  -0.90
iſen  -0.86
[token not rendered]  -0.83
ंदीखरीदारी  -0.80
ویکیپدی  -0.80
<unused43>  -0.79
Menſchen  -0.79
<unused42>  -0.78
المعيارى  -0.78
Positive Logits
fact  0.82
way  0.73
entanto  0.67
moment  0.66
times  0.64
first  0.55
case  0.55
прочем  0.53
indeed  0.53
vezes  0.51
Activations Density 0.792%
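As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1000 token positions, 8 of which activate the feature.
sample = [0.0] * 992 + [0.4, 1.2, 0.1, 2.5, 0.7, 0.3, 0.9, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.792% would mean the feature fires on roughly 8 out of every 1000 tokens in the sampled text.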