INDEX
Explanations
quotes and dialogues within the text
Negative Logits
BELOW      -0.14
Warning    -0.14
WARNING    -0.14
Known      -0.14
лишь       -0.14
WARNING    -0.14
NOTE       -0.14
nerg       -0.14
NOTE       -0.13
此         -0.13
Positive Logits
somebody       0.24
definitely     0.23
certainly      0.23
We             0.23
really         0.22
[              0.21
everybody      0.21
I              0.19
particularly   0.19
Somebody       0.19
Activations Density 0.103%