INDEX
Explanations
phrases or names starting with "Jo"
the presence of the name "Jo" in various contexts
Negative Logits
IFIED           -0.69
interests       -0.68
Fired           -0.66
flies           -0.66
enegger         -0.64
raits           -0.63
grievance       -0.63
amplification   -0.62
CLASS           -0.61
BALL            -0.61
Positive Logits
aquin           1.37
Jo              1.25
zeb             1.06
ining           1.06
jo              1.04
ppa             1.03
anne            1.02
zy              1.02
Anne            0.96
zzle            0.95
Activations Density: 0.028%