Explanations
words related to academic titles and matters of importance or urgency
Negative Logits
brav        -0.68
conversion  -0.67
plat        -0.65
punch       -0.65
shelves     -0.65
chast       -0.65
sarc        -0.63
redes       -0.63
blast       -0.63
delight     -0.62
Positive Logits
ents        1.46
ential      1.41
ency        1.40
encies      1.37
ences       1.37
entials     1.28
iencies     1.28
ited        1.25
ently       1.25
itors       1.25
Activations Density: 0.194%