Explanations
verbs related to observation or visual perception
Negative Logits
она        -0.15
クション   -0.15
ogen       -0.14
audi       -0.14
presso     -0.14
obus       -0.13
oplevel    -0.13
zzle       -0.13
تز         -0.13
ama        -0.13
Positive Logits
ijing      0.16
ago        0.15
brush      0.15
obl        0.14
ELLOW      0.14
CID        0.14
irse       0.14
ed         0.14
AMS        0.14
Honolulu   0.14
Activations Density 0.023%