INDEX
Explanations
instances of subjective opinions and interpretations related to knowledge and understanding
Negative Logits
oi      -0.15
أجÙĦ    -0.14
instead -0.13
ulia    -0.13
èªį     -0.13
xies    -0.13
è¯ij    -0.13
Äĵ      -0.13
NCY     -0.13
ledi    -0.12
Positive Logits
looking 0.47
based   0.40
looking 0.39
reading 0.39
Based   0.35
Looking 0.33
Looking 0.33
Based   0.33
based   0.32
reading 0.32
Activations Density 0.330%