Explanations
phrases related to isolation or being out of touch with reality
Negative Logits
igte      -0.16
èĩ        -0.15
oui       -0.15
èĩ        -0.14
edm       -0.14
ardin     -0.14
ardu      -0.14
endale    -0.14
arden     -0.14
Seam      -0.14
Positive Logits
.scalablytyped  0.18
Ь               0.17
Ĥæķ°            0.16
_dll            0.15
ιÏĥÏĦο          0.15
pus             0.14
ÏĨοÏģ           0.14
lul             0.14
&r              0.14
isify           0.14
Activations Density: 0.005%