Explanations
mentions of locations or environments that are external to a specified context
mentions of the word "outside."
Negative Logits
Token      Logit
ulous      -0.78
anche      -0.78
ander      -0.72
vous       -0.71
onder      -0.69
antine     -0.69
ulously    -0.69
oka        -0.68
ery        -0.68
ency       -0.67
Positive Logits
Token            Logit
Borders          0.75
bounds           0.74
jurisdictions    0.72
Antarctica       0.71
observer         0.68
side             0.67
worlds           0.67
shock            0.67
academia         0.66
world            0.65
Activations Density: 0.034%
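
To give a sense of scale for the density figure, here is a minimal sketch. It assumes, and the page does not confirm this, that density means the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is taken to mean this fraction; the dashboard
    itself does not state how the figure is computed.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 34 of which are active.
sample = [0.0] * 99_966 + [1.5] * 34
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.034%
```

Under that assumption, a density of 0.034% would correspond to the feature firing on roughly 34 of every 100,000 tokens.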