INDEX
Explanations
strings related to application names
Negative Logits
eka       -0.74
shots     -0.70
emaker    -0.67
pei       -0.66
manship   -0.65
culus     -0.64
ew        -0.63
jay       -0.63
states    -0.63
bullet    -0.63
Positive Logits
ained     1.14
ainer     1.10
ategy     1.10
uments    1.01
aining    0.98
ategic    0.97
ains      0.95
anded     0.94
ader      0.91
icken     0.90
Activations Density 0.024%
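
As a rough illustration of what a density figure like this could mean, the sketch below assumes that activation density is the fraction of sampled token positions on which the feature fires (activation above zero). That definition is an assumption, not something the page states, and the function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 12,500 token positions.
acts = [0.0] * 12_497 + [1.3, 0.4, 2.1]
density = activation_density(acts)
print(f"{density:.3%}")
```

With this made-up sample the feature is active on 3 of 12,500 positions, which is the same order of sparsity as the figure reported above.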