INDEX
Explanations
parts of descriptive phrases
Negative Logits
patronage     0.51
Belle         0.46
mitigating    0.46
autres        0.46
ро            0.46
patrons       0.46
अन्याय         0.45
competency    0.44
belle         0.43
mercenaries   0.43
Positive Logits
cB            0.51
تل            0.45
Steps         0.43
लाहि           0.42
السما         0.42
ted           0.42
txt           0.42
łącz          0.42
تل            0.41
ժ             0.41
Activations Density 0.000%