INDEX
Explanations
special characters or symbols
Negative Logits
Lump       -0.72
Gators     -0.71
wart       -0.71
Tall       -0.69
Turtles    -0.68
Bengal     -0.68
ugg        -0.68
Brow       -0.67
hawks      -0.67
Dynam      -0.67
Positive Logits
×Ļ×        2.03
×ķ         1.91
×          1.90
×Ļ         1.88
ש          1.81
×IJ         1.80
׾          1.79
×ŀ         1.79
ר          1.78
×Ķ         1.74
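Several of the positive-logit tokens above look like encoding debris, but they may simply be raw vocabulary strings. The sketch below assumes the tokenizer is a GPT-2 style byte-level BPE, which this page does not state. In that scheme every byte is displayed as a printable stand-in character, so a string such as `×Ļ` would stand for the two bytes 0xD7 0x99. The helper names `byte_to_char_table` and `decode_token` are hypothetical and only illustrate how such strings could be mapped back to UTF-8 text.

```python
def byte_to_char_table():
    """Byte -> display character, following the GPT-2 byte-level BPE convention.

    Assumption: the dashboard's tokenizer uses this convention.
    """
    # Bytes that are shown as themselves.
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    table = {b: chr(b) for b in printable}
    # Every other byte is shifted to code points starting at 256, in ascending order.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {c: b for b, c in byte_to_char_table().items()}


def decode_token(token):
    """Turn a byte-level token string back into text (hypothetical helper)."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character is already ordinary text (for example a decoded letter).
            raw.extend(ch.encode("utf-8"))
    # Tokens that end mid-character decode to the replacement character U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["×Ļ", "×ķ", "×IJ", "×ŀ", "×Ķ", "×"]:
        decoded = decode_token(token)
        # Print code points rather than glyphs to avoid terminal encoding issues.
        print(ascii(token), "->", [f"U+{ord(c):04X}" for c in decoded])
```

Under that assumption the decoded code points fall in the Hebrew block (U+05D0 to U+05EA), while a lone `×` is an incomplete byte sequence and has no standalone character.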
Activations Density 0.011%