INDEX
Explanations
expressions related to personal feelings and thoughts
Negative Logits
―――――       -0.96
itſelf      -0.94
doubtnut    -0.91
་་          -0.90
Majefty     -0.89
ſelf        -0.85
)";         -0.85
myſelf      -0.84
Anſ         -0.81
Jefus       -0.78
Positive Logits
mara        0.57
Nowadays    0.54
thing       0.54
nice        0.53
it          0.51
advices     0.51
everybody   0.50
somebody    0.50
!           0.50
ON          0.49
Activations Density: 0.882%