INDEX
Explanations
expressions that convey judgement or assumptions about actions and situations
Negative Logits
odyn        -0.16
ihan        -0.16
croft       -0.15
TEE         -0.15
tiv         -0.15
jom         -0.14
ombre       -0.14
zte         -0.14
worth       -0.14
asio        -0.14
Positive Logits
bine        0.18
ç¢          0.15
oni         0.15
defaults    0.15
elez        0.15
762         0.14
avis        0.14
icz         0.14
blockSize   0.14
168         0.14
Activations Density: 0.007%