Explanations
special characters or symbols in the text
Negative Logits
àµįà´
-0.17
/thumb
-0.16
££
-0.14
àµ
-0.14
,**
-0.14
âĢĤ
-0.14
â
-0.14

-0.14
oje
-0.14
Ô
-0.13
Positive Logits
âĶĢâĶĢ
0.32
âķ
0.28
âĶ
0.26
âĶ
0.24
âĶĢ
0.22
âķ
0.22
âĶľâĶĢâĶĢ
0.22
âĶĢâĶĢâĶĢâĶĢ
0.21
âͬ
0.21
âķIJ
0.20
Activations Density 0.001%
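
The token strings in the logit lists above look like byte-level BPE vocabulary entries, where each raw byte is displayed as a stand-in printable character. The following is a minimal sketch for turning such entries back into readable text, assuming the dashboard uses the GPT-2 style byte-to-character table; that assumption is not confirmed by this page, and the helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to one printable character.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token back to raw bytes, then decode those bytes as UTF-8.

    Raises KeyError if the token contains a character outside the table.
    Incomplete byte sequences are replaced rather than raising an error.
    """
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")
```

Under this assumption, `decode_token("âĶĢ")` gives the box-drawing character `─` (U+2500) and `decode_token("âĶľâĶĢâĶĢ")` gives `├──`, which would be consistent with the explanation "special characters or symbols in the text". Shorter entries such as `âĶ` are incomplete UTF-8 prefixes and do not decode to a full character on their own.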