Explanations
phrases that express the concept of non-existence or impossibility
Negative Logits
Token      Logit
egrator    -0.16
isko       -0.16
oload      -0.16
assage     -0.14
utters     -0.14
ivel       -0.13
illis      -0.13
ëªħìĿĦ     -0.13
chang      -0.12
acional    -0.12
Positive Logits
Token      Logit
thing      1.38
thing      1.09
Thing      1.07
Thing      0.98
cosa       0.79
things     0.70
THING      0.70
coisa      0.70
(thing     0.69
things     0.60
Activations Density 0.102%
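
As a rough illustration of what a density figure like this could mean, the sketch below assumes density is the fraction of sampled tokens on which the feature's activation is above zero. That definition is an assumption, not something the page states. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.102%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 102 active tokens out of 100,000 (not real data).
acts = [0.0] * 99_898 + [1.5] * 102
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.102% would mean the feature fires on roughly one token in a thousand.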