INDEX
Explanations
expressions of appreciation or compliments
Negative Logits
}}$}          -0.92
myſelf        -0.85
ſelf          -0.81
cherchés      -0.77
.[/           -0.76
***/          -0.75
دانشنامهٔ     -0.74
―――――         -0.72
itſelf        -0.71
!")           -0.71
Positive Logits
<eos>         0.69
I             0.63
it            0.63
//            0.60
podar         0.60
Go            0.59
The           0.56
To            0.56
Do            0.55
i             0.54
Activations Density 0.151%