Explanations
expressions of excitement and sharing personal updates
Negative Logits
airo      -0.17
ucas      -0.15
ainer     -0.14
zel       -0.14
ateg      -0.14
Gel       -0.13
ird       -0.13
ucker     -0.13
aneous    -0.13
eper      -0.13
Positive Logits
everyone     0.51
everybody    0.48
Everyone     0.44
everyone     0.42
y            0.42
Everyone     0.41
大家         0.40
Everybody    0.39
Everybody    0.37
anyone       0.31
Activations Density 0.198%
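
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard may define it differently.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 2 of 1000 tokens.
sample = [0.0] * 998 + [1.3, 0.7]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.198% would mean the feature is active on roughly 2 of every 1000 sampled tokens.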