INDEX
Explanations
words related to hypocrisy and high blood pressure
Negative Logits
bury     -0.15
âĦĸ      -0.14
quest    -0.14
trap     -0.14
ifold    -0.14
Mir      -0.14
uely     -0.13
Gel      -0.13
blind    -0.13
alf      -0.13
Positive Logits
hyp       0.22
ervisor   0.21
Hyp       0.21
ocrisy    0.20
hypo      0.19
thesize   0.19
undai     0.19
nosis     0.18
гип       0.18
thesized  0.17
Activations Density 0.012%