Explanations
"when" or "what" followed by auxiliary verbs or nouns
Negative Logits
target      0.48
sapphire    0.44
tariff      0.42
beam        0.40
target      0.39
pair        0.39
relies      0.38
chromatic   0.38
economic    0.38
similar     0.38
Positive Logits
Porque      0.57
Quando      0.56
Τα          0.53
esistono    0.52
мама        0.50
.".,        0.50
𝐄           0.50
mamá        0.49
पहले        0.49
quando      0.49
Activations Density: 0.000%