INDEX
Explanations
words related to knowing a state, level, or how much you know
conditions
Negative Logits
Token      Logit
where      -0.92
(blank)    -0.79
where      -0.78
.          -0.70
Where      -0.68
↵↵         -0.67
that       -0.67
the        -0.64
<eos>      -0.63
a          -0.62
Positive Logits
Token            Logit
للمعارف          1.31
المعيارى         1.26
للاسماء          1.25
مرئيه            1.23
Normdatei        1.22
utafitiHapana    1.21
otomatig         1.20
expandindo       1.17
новниш           1.16
UrlResolution    1.14
Activations Density 0.900%