INDEX
Explanations
details or elements that are closely related or associated with each other
phrases indicating proximity or detailed attention
Negative Logits
ICAN        -0.79
Chaser      -0.71
Mania       -0.70
ule         -0.68
————————    -0.64
Bucket      -0.63
Jackets     -0.63
Mayhem      -0.63
Reasons     -0.62
ista        -0.62
Positive Logits
aligned       0.96
minded        0.87
wired         0.85
resembles     0.83
enough        0.83
minded        0.82
closely       0.82
connected     0.81
intertwined   0.81
tuned         0.80
Activations Density 0.010%