INDEX
Explanations
the presence of application-related terminology or structure in the text
Negative Logits
Token    Logit
品       -0.47
val      -0.47
a        -0.47
more     -0.47
ámos     -0.46
пак      -0.45
they     -0.45
Do       -0.44
T        -0.44
its      -0.42
Positive Logits
Token        Logit
Hentet       1.04
Normdatei    0.80
Scénario     0.79
estekak      0.76
OFDb         0.75
NDEBUG       0.72
Superhost    0.70
ostavi       0.70
ⓧ            0.70
ویکیپدی      0.69
Activations Density: 0.206%