Explanations
mentions of the word "Yo"
references to a specific individual named "Yo" and variations of that name
Negative Logits
Token      Logit
eele       -0.80
peat       -0.77
weight     -0.73
lear       -0.72
ortion     -0.70
ulence     -0.70
enment     -0.69
entimes    -0.69
acular     -0.68
ulent      -0.68
Positive Logits
Token      Logit
akov       0.79
nan        0.76
agi        0.75
nas        0.73
ahoo       0.73
Yo         0.73
Huh        0.72
Yan        0.71
Maker      0.70
orkshire   0.68
Activations Density: 0.032%