Explanations
proper nouns related to popular culture and current events
Negative Logits
Token           Logit
Ò               -0.67
because         -0.67
depending       -0.65
̶               -0.62
yond            -0.61
every           -0.60
respectively    -0.60
$.              -0.60
Els             -0.59
many            -0.58
Positive Logits
Token           Logit
Profile         1.17
âĵĺ             0.95
Overview        0.91
welcomes        0.90
Gets            0.90
celebrates      0.86
Quote           0.85
Facts           0.85
Releases        0.85
Says            0.83
Activations Density: 2.895%