INDEX
Explanations
words related to limitations or restrictions
concepts related to limitations or restrictions
Negative Logits
Token       Logit
story       -0.79
estone      -0.74
ston        -0.73
Honour      -0.72
joy         -0.71
tein        -0.70
ocaust      -0.70
jamin       -0.68
Naz         -0.68
clamation   -0.64
Positive Logits
Token         Logit
constraints   1.38
constraint    1.29
constrained   1.11
restraints    1.05
imposed       0.96
incent        0.92
besie         0.88
cooker        0.87
pressures     0.86
dictates      0.85
Activations Density: 0.008%