INDEX
Explanations
mentions of different fictional worlds or settings within text
mentions of the concept of "world"
Negative Logits
ippi          -0.95
sbm           -0.87
ifer          -0.84
oug           -0.78
etsk          -0.76
Mehran        -0.74
aturday       -0.72
odox          -0.71
++++          -0.71
actionGroup   -0.69
Positive Logits
liness        1.12
building      1.11
wide          0.89
scape         0.89
domination    0.86
arium         0.83
view          0.83
builders      0.81
views         0.80
spawn         0.77
Activations Density 0.067%