INDEX
Explanations
phrases that express uncertainty or subjective opinions
Negative Logits
Token            Logit
OGND             -1.21
цездатний        -1.02
ImageContext     -0.99
ItemBackground   -0.96
Vidite           -0.94
wireType         -0.93
Wikimedijinoj    -0.93
intptr           -0.93
LookAnd          -0.90
الحره            -0.90
Positive Logits
Token            Logit
.                0.54
until            0.46
p                0.45
or               0.42
v                0.42
fer              0.42
until            0.41
Until            0.41
)                0.40
ε                0.39
Activations Density: 0.180%