Explanations
names of people
references to specific individuals or entities
Negative Logits
bailed          -0.63
reign           -0.60
nights          -0.60
succeeding      -0.60
fingerprints    -0.60
optionally      -0.56
Alleg           -0.56
heels           -0.55
scalp           -0.55
odder           -0.55
Positive Logits
acan            1.09
é¾įå            0.87
olit            0.86
annis           0.84
acho            0.84
lda             0.80
aca             0.80
ACA             0.78
throp           0.78
heart           0.78
Activations Density: 0.068%