Explanations
adjectives and participles that describe characteristics or qualities
Negative Logits
ust       -0.15
rike      -0.15
zioni     -0.14
rij       -0.14
alytics   -0.14
iginal    -0.14
*)_       -0.14
å«        -0.13
Strikes   -0.13
eof       -0.13
Positive Logits
upo         0.17
etically    0.16
ly          0.15
?           0.15
ically      0.15
istically   0.14
uably       0.14
emente      0.14
izing       0.14
ELY         0.14
Activations Density 0.247%