INDEX
Explanations
abstract concepts and topics
Negative Logits
vero      -0.09
forth     -0.09
729       -0.09
оÑĤли     -0.09
predic    -0.09
clare     -0.08
imon      -0.08
Lazar     -0.08
afil      -0.08
Alv       -0.08
Positive Logits
matter    0.25
topic     0.24
matter    0.19
subject   0.18
topics    0.17
topic     0.16
Matter    0.16
matters   0.16
SUBJECT   0.15
subject   0.15
Activations Density: 0.067%