Explanations
the word "Aust" or "Austrian" at various activation levels
mentions of the word "Aust" in various contexts, particularly related to Austro-Hungarian references
Negative Logits
SPONSORED     -0.85
displayText   -0.76
FTWARE        -0.71
MLG           -0.69
EngineDebug   -0.67
WAYS          -0.67
TED           -0.66
PATH          -0.66
HEAD          -0.65
çīĪ           -0.64
Positive Logits
erity         1.21
inite         1.08
rika          1.05
Aust          1.04
rians         1.03
rator         1.01
ereo          1.01
rals          0.98
alis          0.97
rones         0.95
Activations Density 0.015%
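
As a rough illustration of how figures like the ones above are commonly derived, the sketch below assumes that a feature's logit effects are the dot products of its decoder direction with each token's unembedding vector, and that activation density is the fraction of sampled tokens on which the feature fires. Both are assumptions about the usual convention, not something this page states. The function names, the two-dimensional vectors, and the activation values are hypothetical toy data and are unrelated to the numbers listed above.

```python
from typing import Dict, List, Sequence, Tuple


def logit_effects(
    decoder_direction: Sequence[float],
    unembedding: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Dot the feature's decoder direction with each token's unembedding vector.

    Assumption: this dot product is what a "logit" column reports.
    """
    return {
        token: sum(d * u for d, u in zip(decoder_direction, vector))
        for token, vector in unembedding.items()
    }


def top_logits(
    effects: Dict[str, float], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most negative and the k most positive (token, effect) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of sampled tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data only: a 2-dimensional "model" with three made-up tokens.
toy_direction = [1.0, -0.5]
toy_unembedding = {
    "Aust": [1.0, 0.0],
    "PATH": [-0.5, 0.5],
    "the": [0.0, 0.0],
}

effects = logit_effects(toy_direction, toy_unembedding)
negative, positive = top_logits(effects, k=1)
density = activation_density([0.0, 0.0, 2.0, 0.0])

print("most negative:", negative)
print("most positive:", positive)
print(f"density: {density:.3%}")
```

A density as low as the 0.015% reported above would, under this reading, mean the feature fires on roughly 15 of every 100,000 sampled tokens.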