INDEX
Explanations
assertions about personal beliefs and experiences
personal states and actions
Negative Logits
themſelves    -0.77
itſelf        -0.65
themselves    -0.61
فريبيس    -0.61
himſelf       -0.56
ainfi         -0.54
日閲覧    -0.51
themselves    -0.50
他們的    -0.50
ectoria       -0.49
Positive Logits
myself    0.76
myself    0.61
my        0.60
Myself    0.58
I         0.53
meinem    0.49
Myself    0.48
meine     0.48
minhas    0.48
meinen    0.46
Activations Density: 0.088%