INDEX
Explanations
phrases or sentences that contain contrasting elements or ideas
Negative Logits
Travels        -0.72
activ          -0.61
culminated     -0.57
conven         -0.57
culmin         -0.57
aw             -0.56
personalized   -0.56
redes          -0.55
accelerated    -0.55
ize            -0.54
Positive Logits
nor            1.89
nor            1.60
Nor            1.56
Nor            1.31
Instead        1.30
yet            1.25
Instead        1.23
Neither        1.19
unless         1.16
neither        1.16
Activations Density 3.393%