Explanations
phrases related to analyzing details or providing in-depth information
Negative Logits
ICAN       -0.79
AIN        -0.76
Sov        -0.72
Mania      -0.68
Thieves    -0.68
Freak      -0.64
onso       -0.64
Pengu      -0.63
heim       -0.63
ains       -0.63
Positive Logits
proximity      1.08
minded         0.93
resemblance    0.91
shave          0.86
paren          0.85
relatives      0.84
enough         0.81
ties           0.81
confid         0.81
relations      0.79
Activations Density 0.041%