Explanations
language related to government policies and regulations
Negative Logits
Bos         -0.73
Salam       -0.69
Yao         -0.68
Pony        -0.66
Towers      -0.65
Crusader    -0.65
Typhoon     -0.64
Rocket      -0.64
Turtle      -0.63
Droid       -0.63
Positive Logits
rogens      1.06
rogen       1.03
nurture     0.84
manipulate  0.84
rew         0.84
refine      0.82
distribute  0.81
execute     0.79
/+          0.78
analyze     0.78
Activations Density: 3.109%