INDEX
Explanations
instances of 'or' and other alternatives in the text
Negative Logits
Token     Logit
/or       -0.21
swer      -0.19
нÑİ       -0.18
empo      -0.18
venir     -0.17
ì¹        -0.16
emas      -0.15
seys      -0.15
(blank)   -0.15
/OR       -0.15
Positive Logits
Token     Logit
ignal     0.28
Bust      0.27
ourke     0.26
bust      0.24
anged     0.24
anging    0.24
phans     0.23
acles     0.23
ator      0.22
angen     0.21
Activations Density 0.143%