INDEX
Explanations
specific names, specifically "Lucas," with varying intensity
references to the name "Lucas."
Negative Logits
ãĥĥãĥĪ    -0.70
req       -0.69
reconc    -0.67
à¨        -0.66
BOOK      -0.65
eering    -0.63
ifice     -0.63
lying     -0.63
ħĭ        -0.63
lain      -0.61
Positive Logits
film         1.26
Film         1.10
Lucas        0.93
Skywalker    0.93
ious         0.87
ifer         0.84
avin         0.78
Hunt         0.76
afort        0.76
eland        0.76
Activations Density 0.007%