Explanations
phrases indicating the cessation or end of a state or condition
Negative Logits
Token                 Logit
Himo                  -0.68
intptr                -0.65
(token not rendered)  -0.65
ddelweddau            -0.64
ویکیپدیا              -0.64
RegressionTest        -0.63
väg                   -0.63
ظيم                   -0.63
Numerade              -0.63
iettivo               -0.62
Positive Logits
Token                 Logit
longer                0.80
不再                  0.76
longer                0.75
Unters                0.71
onOptions             0.71
länger                0.69
Longer                0.66
Beg                   0.65
alagi                 0.63
llog                  0.60
Activations Density: 0.082%