Explanations
possessive forms indicating identity or relationship
Negative Logits
uates       -0.83
inges       -0.78
forts       -0.72
confines    -0.71
abba        -0.69
%%%%        -0.69
obo         -0.68
awks        -0.68
lehem       -0.67
urches      -0.65
Positive Logits
been        1.28
gotten      1.20
been        1.05
grown       0.98
gone        0.94
fallen      0.94
done        0.92
undergone   0.90
Been        0.87
withdrawn   0.87
Activations Density 0.076%