INDEX
Explanations
words related to instructions, specific actions, or elements of a document
words related to annotations and their role in information presentation
Negative Logits
pedigree    -0.66
avez        -0.64
nas         -0.63
porting     -0.62
wild        -0.61
friends     -0.60
bury        -0.60
iaz         -0.60
library     -0.60
bred        -0.60
Positive Logits
TION        1.24
OGR         0.84
luster      0.81
]}          0.80
acular      0.78
...]        0.77
Frieza      0.77
odied       0.74
pmwiki      0.72
SIZE        0.71
Activations Density: 0.013%