INDEX
Explanations
mentions of the word "jeep"
the word "je" in various contexts
Negative Logits
Token     Logit
work      -0.67
onics     -0.66
urate     -0.66
under     -0.66
power     -0.64
Shards    -0.63
Beacon    -0.63
aster     -0.62
ignt      -0.61
coin      -0.61
Positive Logits
Token     Logit
je        4.03
Je        1.81
je        1.67
Jeep      1.61
Je        1.37
boo       1.20
vill      1.18
sne       1.12
ja        1.09
ber       1.03
Activations Density: 0.011%