INDEX
Explanations
proper names
the start or end of a document
Negative Logits
Chaser        -0.71
wiser         -0.61
assum         -0.60
uphill        -0.58
destro        -0.56
buggy         -0.55
Cinderella    -0.54
cheat         -0.54
cyn           -0.54
Gemini        -0.53
Positive Logits
erville        0.66
heng           0.65
;;;;;;;;;;;;   0.64
Refugees       0.63
thood          0.62
ocaust         0.62
iery           0.61
ATIONAL        0.61
ILCS           0.60
Caption        0.60
Activations Density 0.219%