INDEX
Explanations
phrases or words related to specific locations or landmarks
proper nouns and locations
Negative Logits
mechanically    -0.73
normalized      -0.68
PLAY            -0.67
AMERICA         -0.67
flaw            -0.66
fictitious      -0.65
predictable     -0.64
scratch         -0.64
brake           -0.64
frantic         -0.64
Positive Logits
hai             1.36
ai              1.36
oa              1.33
onga            1.29
oi              1.26
aru             1.25
ku              1.24
ui              1.21
apa             1.21
wana            1.21
Activations Density: 0.344%