INDEX
Explanations
words related to physical spaces or locations
references to physical or conceptual spaces
Negative Logits
vengeance    -0.73
Turk         -0.68
IVERS        -0.66
Rupert       -0.64
ocide        -0.63
blackmail    -0.62
neurot       -0.62
POV          -0.62
dism         -0.62
ctive        -0.60

Positive Logits
spaces       1.20
hips         1.18
Spaces       0.99
hops         0.96
uits         0.95
paces        0.91
pace         0.87
dayName      0.86
lot          0.85
poons        0.85
Activations Density 0.015%