INDEX
Explanations
verbs related to actions and responsibilities
Negative Logits
chrom        -0.79
ellen        -0.73
reddits      -0.72
Seym         -0.70
Pie          -0.66
earch        -0.65
Hill         -0.64
Skydragon    -0.64
ederation    -0.63
amine        -0.62
Positive Logits
forward      0.87
loads        0.84
forward      0.82
weight       0.81
weight       0.78
onward       0.78
carrier      0.74
borne        0.73
loads        0.71
IGH          0.71
Activations Density 0.539%
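
A density figure like the one above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading, assuming "fires" means an activation strictly above zero. The function name, the threshold, and the toy data are hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (zero by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which activate the feature.
toy_activations = [0.0] * 995 + [0.4, 1.2, 0.1, 2.5, 0.7]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On this reading, a density of 0.539% would mean the feature is active on roughly 5 of every 1,000 sampled tokens.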