INDEX
Explanations
prepositional phrases related to locations or connections between entities
Negative Logits
sbm             -0.70
andowski        -0.70
pring           -0.65
yg              -0.65
zin             -0.64
aryn            -0.64
tri             -0.62
pair            -0.62
DragonMagazine  -0.62
congr           -0.62
Positive Logits
its             0.91
theirs          0.87
pload           0.77
their           0.71
Cuba            0.70
rubble          0.68
orbit           0.67
Havana          0.65
Himself         0.65
itself          0.64
Activations Density: 0.270%