Explanations
proper nouns related to a specific person or location
occurrences of a specific term or phrase related to a category, possibly related to "ga" as a theme across various contexts
Negative Logits
lessly     -0.83
TY         -0.72
tle        -0.71
paren      -0.71
CAST       -0.70
landish    -0.70
neys       -0.69
andem      -0.68
Topics     -0.68
ty         -0.66
Positive Logits
vernment   1.03
veyard     0.94
ussian     0.83
ña         0.82
arb        0.81
Ga         0.81
arde       0.80
ffiti      0.80
uthor      0.76
xon        0.76
Activations Density 0.010%