INDEX
Explanations
words related to people or places, possibly focusing on specific individuals or locations
high-frequency occurrences of the substring "av"
Negative Logits
Token      Logit
Turing     -0.77
sofar      -0.69
utenant    -0.68
fog        -0.67
poppy      -0.65
tug        -0.65
unfor      -0.65
handy      -0.65
porous     -0.65
cold       -0.64
Positive Logits
Token      Logit
av         1.20
irus       1.11
oided      1.02
ascular    1.01
atars      1.00
ille       1.00
arro       1.00
itus       0.97
atar       0.97
irtual     0.96
Activations Density 0.009%