Explanations
phrases related to a specific person or name, possibly focusing on political figures or events
proper nouns, specifically names and titles
Negative Logits
ctors            -0.67
FIRE             -0.67
glers            -0.66
Flavoring        -0.64
inki             -0.63
blunt            -0.60
vae              -0.59
slash            -0.58
flames           -0.58
magnification    -0.58
Positive Logits
ettes            0.72
Lauder           0.70
enne             0.68
steen            0.68
ette             0.67
idge             0.67
Pwr              0.65
endment          0.65
olla             0.65
ISO              0.64
Activations Density: 0.135%