INDEX
Explanations
physical connection or conversation
Negative Logits
all        0.40
pel        0.38
గొప్ప       0.38
ຖືກ        0.38
직접       0.36
自分       0.36
વી         0.35
സ്വന്ത      0.35
spare      0.35
Два        0.35
Positive Logits
consulted  0.45
consultas  0.44
kaj        0.41
iatan      0.40
konsult    0.39
Charles    0.39
कांटे       0.39
ള്         0.39
uhkan      0.39
Spokes     0.38
Activations Density 0.003%