INDEX
Explanations
singular words related to specific topics
discussions or references to various topics
Negative Logits
ignt     -0.85
inyl     -0.69
ATES     -0.69
ardo     -0.69
ramid    -0.67
olyn     -0.65
zman     -0.63
conn     -0.63
ypes     -0.63
omp      -0.62
Positive Logits
matter       1.14
matter       1.07
topic        1.05
topics       1.00
discussed    0.90
debated      0.84
topic        0.82
area         0.81
Matter       0.81
ivity        0.79
Activations Density: 0.038%