Explanations
proper nouns related to individuals
variations of the name "Darryl" and other similar names
Negative Logits
Token          Logit
cffff          -0.70
metaphor       -0.66
compromise     -0.65
tipping        -0.62
respons        -0.61
obstruction    -0.60
pen            -0.60
izoph          -0.59
amnesty        -0.59
exc            -0.59
Positive Logits
Token          Logit
nda            0.97
tta            0.88
von            0.88
lene           0.87
lla            0.86
enne           0.86
xia            0.85
hea            0.81
ndra           0.80
ela            0.80
Activations Density: 0.075%
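
The density figure is easiest to read as a share of tokens. The following is a minimal sketch, assuming density means the percentage of sampled tokens on which the feature's activation is above zero; the source does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 3 active tokens out of 4000.
sample = [0.0] * 3997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.075%
```

Under this reading, a density of 0.075% corresponds to the feature firing on roughly 3 of every 4000 tokens, which would fit a feature tied to a narrow set of names.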