INDEX
Explanations
words related to discussions or issues being talked about
Negative Logits
ardo     -0.74
olyn     -0.71
ramid    -0.70
ignt     -0.70
omp      -0.67
inyl     -0.65
oÄŁ      -0.64
Dim      -0.63
Yates    -0.63
ATES     -0.62
Positive Logits
matter       1.08
topics       1.03
topic        0.93
discussed    0.86
debated      0.84
matter       0.83
discussion   0.81
Topics       0.81
ussed        0.79
ivities      0.79
Activations Density 0.068%