INDEX
Explanations
phrases related to confinement or restriction
phrases related to confinement and restrictions
Negative Logits
Authors     -0.70
llo         -0.68
alis        -0.68
ulia        -0.68
opoulos     -0.66
choir       -0.65
eli         -0.64
orks        -0.62
itia        -0.61
uproar      -0.61
Positive Logits
Borders     0.84
hold        0.78
leash       0.76
loader      0.75
posts       0.71
borders     0.68
rency       0.68
ainers      0.67
desks       0.65
road        0.65
Activations Density 0.132%