INDEX
Explanations
single-numbered items in lists
references to numbered lists or items
Negative Logits
scenes        -0.70
allowed       -0.66
espie         -0.66
leading       -0.65
fences        -0.64
bent          -0.64
apon          -0.63
awaru         -0.62
milit         -0.62
thinkable     -0.61
Positive Logits
Password      1.29
Corinthians   1.12
st            1.07
125           1.06
120           1.02
½             0.95
000000        0.88
123           0.88
128           0.85
RM            0.80
Activations Density: 0.056%