INDEX
Explanations
phrases related to specific individuals, possibly in positions of power or influence
words and phrases related to the concept of "conditions"
Negative Logits
kaya     -0.77
dfx      -0.73
cca      -0.70
ku       -0.67
ching    -0.67
ngth     -0.66
yrim     -0.66
doms     -0.65
DW       -0.65
chens    -0.64
Positive Logits
ensed        0.94
itionally    0.92
itional      0.92
uctor        0.91
uit          0.85
itions       0.84
ega          0.82
uits         0.80
iments       0.76
uitous       0.76
Activations Density: 0.028%