INDEX
Explanations
adjectives expressing intensity or strength
terms related to intensity and characteristics of actions or phenomena
Negative Logits
hoff       -0.67
sites      -0.64
Expend     -0.64
appa       -0.63
headers    -0.63
enegger    -0.62
orks       -0.61
inate      -0.61
Mech       -0.61
ilda       -0.60
Positive Logits
resembling           1.19
reminiscent          1.16
comparable           1.11
akin                 1.10
unsu                 1.08
similar              1.07
indistinguishable    1.06
unimaginable         1.06
unworthy             1.00
analogous            1.00
Activations Density: 0.427%