INDEX
Explanations
expressions related to mathematical and computational functions
Negative Logits
twimg      -0.57
שוליים     -0.50
ſelf       -0.45
%;         -0.44
ſta        -0.43
-};        -0.43
panel      -0.43
canary     -0.42
jPanel     -0.42
Gön        -0.42
Positive Logits
indirectly       0.57
contentLoaded    0.54
indirec          0.48
through          0.47
indirect         0.46
overall          0.42
indirect         0.40
Indirect         0.40
angliski         0.40
Overall          0.40
Activations Density 1.078%