INDEX
Explanations
words related to actions and interactions between people or about specific situations
specific non-lexical characters or symbols used in dialogue or narration
Negative Logits
usefulness      -0.86
decentral       -0.85
merits          -0.83
ecosystem       -0.82
democracies     -0.82
benefits        -0.81
partnerships    -0.81
neutrality      -0.81
sustainability  -0.80
diversity       -0.80
Positive Logits
then            1.18
she             1.10
him             1.05
later           1.04
his             1.02
Then            1.02
Later           0.98
laugh           0.97
Suddenly        0.97
said            0.93
Activations Density: 0.343%