INDEX
Explanations
references to mirrors and their properties
Negative Logits
первых      -0.88
chargez     -0.75
leeftijd    -0.74
Normdatei   -0.71
arakhand    -0.68
jme         -0.64
Kdo         -0.64
Ừ           -0.64
shund       -0.63
hvem        -0.61
Positive Logits
mirror      2.06
Mirror      2.04
Mirrors     1.95
mirrors     1.93
MIRROR      1.81
Mirror      1.79
Mirrors     1.76
mirror      1.75
mirroring   1.49
mirrored    1.47
Activations Density: 0.100%