INDEX
Explanations
adjectives related to important and complex topics
Negative Logits
trained    -0.80
Cola       -0.77
blers      -0.72
ressor     -0.67
stores     -0.65
mia        -0.63
esters     -0.63
bil        -0.63
VERT       -0.63
ambo       -0.62
Positive Logits
topic      2.00
topics     1.88
issue      1.77
issues     1.73
matters    1.68
matter     1.67
subject    1.64
issues     1.63
questions  1.54
question   1.52
Activations Density: 0.568%