INDEX
Explanations
describing qualities or degree
Negative Logits
overridden    0.80
impossible    0.80
interstellar  0.79
وهي           0.78
entirely      0.77
najwięks      0.77
höchsten      0.77
ultimate      0.75
testament     0.75
Major         0.74
Positive Logits
adequate      0.81
vigorous      0.79
精致          0.79
suaves        0.77
積極          0.77
厰            0.77
rigorous      0.76
充実          0.76
鮮            0.75
coher         0.74
Activations Density: 0.133%