Explanations
phrases indicating social events and community involvement
Negative Logits
Token     Logit
ocker     -0.13
McCabe    -0.13
ê¸ī       -0.13
hee       -0.12
моз       -0.12
abez      -0.12
каж       -0.12
wake      -0.12
inesis    -0.12
471       -0.12
Positive Logits
Token     Logit
present   0.24
Guests    0.23
guests    0.23
present   0.21
speakers  0.21
Guest     0.21
speeches  0.20
guest     0.20
Present   0.20
_present  0.20
Activations Density: 0.145%