Explanations
code snippets and descriptions
Negative Logits
kai            0.67
suspicious     0.65
husband        0.63
perspective    0.62
cosplay        0.62
perspectives   0.61
absurd         0.61
punishment     0.61
controversy    0.61
fus            0.60
Positive Logits
2.61
1.93
1.91
1.58
1.58
1.54
1.44
1.42
1.33
1.11
Activations Density 0.364%