INDEX
Explanations
references to or mentions of the word "World"
references to world events or global contexts
Negative Logits
penalty       -0.69
charge        -0.66
resigned      -0.64
bait          -0.63
adamant       -0.63
affirmative   -0.63
paternal      -0.62
charged       -0.62
contempt      -0.60
Tanner        -0.60
Positive Logits
World         3.91
world         2.49
World         2.15
WORLD         1.91
Worlds        1.59
world         1.49
Global        1.41
orld          1.37
Planet        1.37
WOR           1.36
Activations Density: 0.013%