INDEX
Explanations
phrases that express the idea of non-existence or absolute conditions
Negative Logits
Token       Logit
lix         -0.15
Pride       -0.14
arten       -0.14
xdb         -0.14
tracker     -0.14
awa         -0.14
challenge   -0.14
kich        -0.14
diam        -0.13
RTC         -0.13
Positive Logits
Token       Logit
agged       0.16
Fry         0.14
ound        0.14
ırak        0.13
ahl         0.13
haf         0.13
iem         0.13
ChildIndex  0.13
536         0.13
rosso       0.13
Activations Density 0.070%
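
The density figure above can be read as the share of token positions on which the feature is active. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold (0.0 here).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 7 active positions out of 10,000 tokens.
sample = [0.0] * 9993 + [1.2] * 7
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.070%
```

Under this reading, a density of 0.070% would correspond to roughly 7 active positions per 10,000 tokens.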