INDEX
Explanations
specific nouns and verbs associated with cultural or artistic references
Negative Logits
unable      -0.15
scal        -0.14
abilities   -0.14
ilities     -0.14
inh         -0.14
tiv         -0.14
fed         -0.13
ÑĤим        -0.13
hos         -0.13
ereco       -0.13
Positive Logits
GENCY       0.15
AGR         0.14
quirrel     0.14
ìĹĦ         0.13
riday       0.13
Summers     0.13
ku          0.13
draft       0.13
endale      0.13
ма          0.13
Activations Density: 0.479%