INDEX
Explanations
references to object-oriented programming concepts, particularly related to 'self'
Negative Logits
Token           Logit
Theſe           -0.97
katze           -0.89
Beſ             -0.88
ſeveral         -0.88
Reſ             -0.84
PhysRevD        -0.83
Efq             -0.82
Monfieur        -0.82
Eſ              -0.81
withstanding    -0.78
Positive Logits
Token           Logit
self            2.13
Self            1.61
SELF            1.36
Self            1.25
Selbst          1.03
само            0.94
Selbst          0.91
αυτο            0.89
自我            0.81
SELF            0.73
Activations Density: 0.115%