Explanations
phrases related to entertainment and celebrity culture
Negative Logits
Token      Logit
aits       -0.21
olls       -0.17
itorio     -0.15
inyin      -0.14
assen      -0.14
cen        -0.14
Brains     -0.14
ighb       -0.14
nels       -0.14
ccd        -0.14
Positive Logits
Token      Logit
Lemon      0.15
992        0.15
bach       0.15
Levy       0.15
752        0.14
761        0.14
oui        0.14
Leopard    0.14
986        0.13
halftime   0.13
Activations Density: 0.111%