INDEX
Explanations
mentions of a specific name "Naruto"
mentions of the term "Naruto"
Negative Logits
Token      Logit
ILCS       -0.99
mable      -0.76
quarters   -0.73
boards     -0.73
strength   -0.73
drawn      -0.71
values     -0.71
blank      -0.71
med        -0.70
power      -0.70
Positive Logits
Token      Logit
uto        1.23
ÄŁ         1.00
Uz         0.96
zza        0.87
ctors      0.84
zzi        0.84
eca        0.78
rio        0.77
zin        0.75
á¹         0.75
Activations Density 0.021%
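
To make the density figure concrete, here is a minimal sketch that assumes "activation density" means the fraction of token positions on which the feature fires (activation above zero). The function name, the threshold, and the sample data are hypothetical and are not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (positions where the feature fires) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 21 of 100,000 token positions.
acts = [0.0] * 99_979 + [1.5] * 21
density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.021%"
```

Under that reading, a density of 0.021% corresponds to roughly 21 firing positions per 100,000 tokens, which is consistent with a feature that responds to one specific name.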