INDEX
Explanations
special characters within a text or code
patterns or sequences of backslashes and characters resembling coding or technical syntax
Negative Logits
destro      -0.83
ascus       -0.78
assassins   -0.74
shot        -0.73
insult      -0.72
theless     -0.72
itcher      -0.70
therap      -0.69
undermin    -0.69
Palestin    -0.67
Positive Logits
AppData     1.04
(\          1.03
wcsstore    1.00
circ        0.94
root        0.92
sq          0.91
gradient    0.91
framework   0.90
\'          0.89
bryce       0.87
Activations Density 0.005%
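
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that activation density means the percentage of sampled tokens on which the feature's activation exceeds a threshold. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; the page itself does not state how the value was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly greater than threshold.

    Assumed definition: density = 100 * (active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 1 of 20,000 sampled tokens.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under that assumed definition, a density of 0.005% would correspond to the feature firing on roughly one token in every 20,000.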