INDEX
Explanations
specific time and date references in the text
Negative Logits
PLICIT    -0.15
mania     -0.15
etwork    -0.15
dings     -0.15
OfWork    -0.14
din       -0.14
543       -0.14
ystack    -0.14
izard     -0.14
wr        -0.14
Positive Logits
uards       0.16
िड          0.15
ivec        0.15
alie        0.15
alc         0.14
ij          0.14
ceb         0.13
sey         0.13
distancia   0.13
oy          0.13
Activations Density: 0.105%