INDEX
Explanations
references to "legend" or related concepts in narratives or descriptions
Negative Logits
Token         Logit
nahilalakip   -0.75
Ril           -0.72
Bix           -0.69
whor          -0.68
Fli           -0.64
setShow       -0.63
drew          -0.62
opsida        -0.60
Twee          -0.60
Tf            -0.59
Positive Logits
Token         Logit
legend        1.43
Legend        1.30
Legends       1.18
Legend        1.16
LEGEND        1.16
LEGEND        1.13
legend        1.13
Legends       1.11
legends       1.07
legends       0.98
Activations Density: 0.006%