Explanations
names or proper nouns
proper nouns, particularly names, throughout the document
Negative Logits
minecraft    -0.92
kefeller     -0.72
zsche        -0.70
ftime        -0.70
atform       -0.69
ĸļ           -0.68
isphere      -0.66
ibaba        -0.64
inki         -0.63
itionally    -0.63
Positive Logits
vez          0.84
abbage       0.71
Sins         0.69
neau         0.69
Gaul         0.69
ij士         0.66
Administ     0.64
Maced        0.64
Detail       0.63
Esc          0.61
Activations Density: 0.517%