Explanations
references to mirrors and mirror-like properties or characteristics
Negative Logits
himo         -0.81
препратки    -0.81
Ged          -0.69
flav         -0.68
lahan        -0.68
Manfred      -0.68
Manfred      -0.66
Sqft         -0.65
Fot          -0.65
Wils         -0.65
Positive Logits
mirror       1.42
Mirrors      1.38
Mirror       1.36
mirrors      1.23
Mirrors      1.23
MIRROR       1.20
Mirror       1.16
mirror       1.09
Spiegel      1.02
miroir       0.98
Activations Density 0.005%
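
As a rough illustration of how to read the density figure, the sketch below treats activation density as the fraction of token positions on which the feature fires. That definition, the `activation_density` helper, the zero threshold, and the sample data are all assumptions made for illustration; the page does not state how its figure was computed.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature is active.
    """
    total = 0
    active = 0
    for value in activations:
        total += 1
        if value > threshold:
            active += 1
    if total == 0:
        raise ValueError("activations must not be empty")
    return active / total


# Hypothetical sample: the feature fires on 5 of 100,000 token positions.
sample = [0.0] * 99_995 + [1.3] * 5
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.005% corresponds to about 5 active positions per 100,000 tokens, which fits a feature tied to a single narrow concept such as mirrors.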