INDEX
Explanations
fractions or percentages
phrases indicating proportions or fractions
Negative Logits
arella        -0.70
Levin         -0.65
uyomi         -0.63
aside         -0.63
ucci          -0.61
Clash         -0.59
reintrodu     -0.58
ãĤ¶           -0.57
pedia         -0.57
Dictionary    -0.56
Positive Logits
icial         0.70
dozen         0.69
course        0.69
ths           0.68
hops          0.64
GDP           0.64
perties       0.64
Sevent        0.63
respondents   0.62
Initialized   0.61
Activations Density 0.088%