Explanations
possessive pronouns indicating ownership or relation
Negative Logits
619       -0.17
ög        -0.15
ãĥĨãĥ«    -0.13
uilder    -0.13
313       -0.13
Ãĸr       -0.13
OMEM      -0.13
дÑĢ       -0.13
ÙĬع       -0.13
æŁĶ       -0.13
Positive Logits
ability     0.21
role        0.19
гÑĥ         0.17
á»ĭ         0.17
attempt     0.16
tendency    0.16
attempts    0.15
ÑģпоÑģоб    0.15
inability   0.15
handling    0.15
Activations Density: 0.169%