Explanations
proper nouns or specific names
pronouns and references to people or groups
Negative Logits
respectively    -0.60
..."            -0.50
solely          -0.49
reef            -0.48
Skydragon       -0.48
nect            -0.47
whichever       -0.47
basket          -0.47
ÏĦ              -0.47
decimal         -0.46
Positive Logits
resa            1.05
odore           1.04
romeda          0.89
notations       0.87
mosp            0.79
withstanding    0.78
etheless        0.78
xiety           0.78
swers           0.73
nsic            0.72
Activations Density: 0.886%