INDEX
Explanations
prepositions and related phrases indicating relationships or connections in text
by names or initials
Negative Logits
Token       Logit
hib         -0.35
-           -0.33
tas         -0.32
UIAlert     -0.32
critical    -0.32
party       -0.30
مفت         -0.30
<eos>       -0.30
ver         -0.30
FR          -0.29
Positive Logits
Token         Logit
✨:            0.98
fjspx         0.93
Италијани     0.90
[@BOS@]       0.87
<unused43>    0.87
<unused41>    0.87
<unused47>    0.87
<unused28>    0.87
<unused8>     0.87
<unused14>    0.87
Activations Density 0.007%