INDEX
Explanations
phrases indicating the presence or existence of content or information
Negative Logits
Orville       -0.82
k             -0.65
nhau          -0.64
lüğ           -0.64
DPP           -0.62
userService   -0.61
@"";          -0.60
bigskip       -0.60
років         -0.58
Leona         -0.57
Positive Logits
Contain       1.41
CONTAIN       1.41
contains      1.35
contain       1.32
Contains      1.28
contained     1.27
enthalten     1.24
Containing    1.14
contains      1.13
Contain       1.13
Activations Density 0.089%