INDEX
Explanations
phrases indicating requests for help or resources
Negative Logits
wh          -0.58
-_-         -0.52
RTLR        -0.51
autique     -0.51
strike      -0.49
Hau         -0.47
Strike      -0.46
Rohy        -0.46
strike      -0.46
PRIME       -0.46
Positive Logits
<bos>        1.03
findpost     0.92
مرئيه        0.72
المعيارى     0.65
متحده        0.65
ⓧ            0.65
بيها         0.63
createStore  0.61
########.    0.61
mybatisplus  0.60
Activations Density: 0.064%