Explanations
references to quality or refinement
Negative Logits
Token             Logit
ulhu              -0.82
raltar            -0.76
jri               -0.75
osponsors         -0.74
Kut               -0.72
thwarted          -0.72
ataka             -0.72
rush              -0.70
soDeliveryDate    -0.70
urg               -0.69
Positive Logits
Token             Logit
tuning            1.02
tuned             0.93
Gael              0.90
arts              0.87
linen             0.82
tune              0.82
dining            0.80
(blank)           0.78
art               0.76
vers              0.72
Activations Density: 0.011%