INDEX
Explanations
phrases that start with "about," indicating a focus on topics of discussion or inquiry
Negative Logits
izable            -0.17
fault             -0.16
ities             -0.15
.scalablytyped    -0.15
argout            -0.14
и                 -0.13
ับม               -0.13
heim              -0.13
rot               -0.13
として            -0.13
Positive Logits
/from         0.21
-NLS          0.20
طريق          0.20
-face         0.17
/to           0.17
lying         0.17
avia          0.16
/by           0.16
(predicate    0.16
how           0.16
Activations Density 0.147%