INDEX
Explanations
entities, verbs, and actions related to diverse topics such as caring, fighting, authority, laughing, giving access, trends in doll sales, scores in sports, paying for services, educational success rates, and more
phrases related to social and political concerns
Negative Logits
meet           -0.81
Annotations    -0.69
sense          -0.65
isconsin       -0.64
payers         -0.63
wow            -0.63
oster          -0.62
asta           -0.61
iber           -0.61
idth           -0.60
Positive Logits
theirs         0.96
hers           0.95
likewise       0.91
Ĥİ             0.64
icter          0.64
similarly      0.63
meanwhile      0.62
oday           0.60
Cinema         0.59
Fem            0.57
Activations Density 0.567%