Explanations
access resources, action-oriented, story prompts
Negative Logits
Token         Logit
imperson      0.50
irrigation    0.50
Irrigation    0.50
polished      0.49
причиной      0.49
deformed      0.46
restored      0.44
poisoned      0.44
resurrected   0.44
polluted      0.43
Positive Logits
Token         Logit
acceler       0.49
osome         0.43
ka            0.42
ਆਪਣ           0.42
ingresso      0.42
जातात         0.41
對於          0.41
xb            0.40
သင်           0.40
तुम्ही         0.40
Activations Density: 0.047%