INDEX
Explanations
expressions related to personal experiences and opinions
NEGATIVE LOGITS
themselves    -0.68
กัน           -0.66
us            -0.58
themselves    -0.56
Portail       -0.54
)}}{          -0.51
              -0.50
idespread     -0.48
bedoeld       -0.48
nhau          -0.47
POSITIVE LOGITS
myself        1.33
myself        1.05
Myself        1.05
personally    0.92
myſelf        0.83
my            0.77
meus          0.74
persönlich    0.71
moje          0.71
mijn          0.70
Activations Density 0.384%
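
For readers unfamiliar with the density figure, here is a minimal sketch of one common way to compute such a number. It assumes that "density" means the fraction of sampled tokens on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so the arithmetic reproduces 0.384%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Synthetic, illustrative data (not taken from the dashboard):
# 12,500 tokens, of which 48 have a nonzero activation.
sample = [0.0] * 12500
for i in range(48):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.384% corresponds to the feature firing on roughly 1 in 260 tokens.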